P-Value
The probability of observing your data (or something more extreme) if there were truly no effect. Widely used, widely misunderstood, and never sufficient on its own to make a decision.
A p-value tells you how likely your observed result would be if the null hypothesis were true—that is, if there were no real difference or effect. A small p-value means the data would be surprising under the assumption of no effect.
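As a concrete (hypothetical) illustration, the tail probability can be computed exactly for a coin-flip experiment: suppose you observe 60 heads in 100 flips, and the null hypothesis is that the coin is fair. A sketch in Python using only the standard library:

```python
from math import comb

def binomial_p_value(heads: int, flips: int, p: float = 0.5) -> float:
    """One-sided p-value: probability of `heads` or more under the null (fair coin)."""
    return sum(comb(flips, k) * p**k * (1 - p) ** (flips - k)
               for k in range(heads, flips + 1))

# 60 heads in 100 flips: surprising, but not astronomically so, under a fair coin
p_val = binomial_p_value(60, 100)
print(f"p = {p_val:.4f}")  # roughly 0.028
```

The p-value here is nothing more than this tail sum: how often a fair coin would produce a result at least as extreme as the one observed.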
What P < 0.05 Actually Means
If p = 0.03, there is a 3% chance of seeing a result at least this extreme, assuming the null hypothesis is true. It does not mean:
- There is a 97% chance the effect is real
- There is a 3% chance you are wrong
- The effect is large or practically meaningful
The 0.05 threshold is a convention, not a natural law. Fisher originally proposed it as a rough guide, not a binary decision rule.
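One way to see that 0.05 is a policy choice rather than a discovery: when the null hypothesis is true, a test calibrated at the 0.05 level flags roughly 5% of experiments as "significant" purely by chance. A small simulation sketch (hypothetical fair-coin experiments, standard library only):

```python
import random
from math import comb

random.seed(42)
N_FLIPS, N_EXPERIMENTS = 100, 5_000

# Exact one-sided tail probabilities under a fair coin: tail[k] = P(heads >= k)
pmf = [comb(N_FLIPS, k) * 0.5 ** N_FLIPS for k in range(N_FLIPS + 1)]
tail = [0.0] * (N_FLIPS + 2)
for k in range(N_FLIPS, -1, -1):
    tail[k] = tail[k + 1] + pmf[k]

# Run many experiments in which the null is TRUE (the coin really is fair)
false_positives = sum(
    tail[sum(random.getrandbits(1) for _ in range(N_FLIPS))] < 0.05
    for _ in range(N_EXPERIMENTS)
)
# Close to 5% (slightly under, because the binomial distribution is discrete)
rate = false_positives / N_EXPERIMENTS
print(f"'significant' results under a true null: {rate:.1%}")
```

Every one of those flagged experiments is a false positive; the threshold only controls how often you tolerate them.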
Why P-Values Are Not Enough
- Large samples detect trivial effects: With enough participants, you can get p < 0.001 for a difference so small it has no practical impact
- Small samples miss real effects: A genuine improvement might produce p = 0.12 simply because you did not have enough data
- P-values say nothing about magnitude: At best they signal that some effect is detectable, not whether it is large enough to matter
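To make the sample-size point concrete, here is a sketch (hypothetical A/B-test counts, normal-approximation z-test) showing the same one-percentage-point lift in conversion rate coming out highly "significant" at 100,000 users per group and nowhere near significant at 2,000:

```python
from math import sqrt, erfc

def two_prop_p_value(x1: int, n1: int, x2: int, n2: int) -> float:
    """Two-sided p-value for a difference in proportions (pooled z-test)."""
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (x2 / n2 - x1 / n1) / se
    return erfc(abs(z) / sqrt(2))  # P(|Z| >= |z|) for a standard normal

# Same 1-point lift (50% vs 51% conversion) at two sample sizes
p_large = two_prop_p_value(50_000, 100_000, 51_000, 100_000)  # far below 0.001
p_small = two_prop_p_value(1_000, 2_000, 1_020, 2_000)        # well above 0.05
```

The effect is identical in both cases; only the amount of data changed. That is why a p-value alone cannot tell you whether a result matters.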
What to Do Instead
Report effect size alongside your p-value. A statistically significant result with a tiny effect size is not worth acting on. A non-significant result with a large effect size may warrant a larger study. Use confidence intervals to communicate the range of plausible values—far more informative than a single yes-or-no threshold.
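A minimal sketch of the confidence-interval approach (hypothetical conversion counts, Wald interval for a difference in proportions):

```python
from math import sqrt

def diff_with_ci(x1: int, n1: int, x2: int, n2: int, z: float = 1.96):
    """Difference in proportions with a 95% Wald confidence interval."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p2 - p1
    return diff, (diff - z * se, diff + z * se)

diff, (lo, hi) = diff_with_ci(50_000, 100_000, 51_000, 100_000)
print(f"lift = {diff:.3f}, 95% CI ({lo:.3f}, {hi:.3f})")
# The interval excludes zero (statistically significant) but also shows the
# plausible lift tops out around 1.4 points -- a judgment call, not a yes/no.
```

Whether a lift in that range justifies shipping the change is a product decision the interval informs and a bare p-value cannot.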
Related Terms
Statistical Significance
A determination that an observed result is unlikely to have occurred by random chance alone. Conventionally indicated by a p-value below 0.05: if there were truly no effect, a result this extreme would occur less than 5% of the time.
Effect Size
A measure of the magnitude of a finding—how big the difference is between conditions, not just whether it exists. Essential for determining practical significance beyond statistical significance.
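As an illustration, one common effect-size measure is Cohen's d: the difference in group means scaled by the pooled standard deviation. A sketch with made-up task-completion times:

```python
from math import sqrt
from statistics import mean, stdev

def cohens_d(a: list[float], b: list[float]) -> float:
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_sd = sqrt(((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2)
                     / (na + nb - 2))
    return (mean(b) - mean(a)) / pooled_sd

# Hypothetical task-completion times (seconds) for two designs
old_design = [42, 45, 41, 44, 43, 46, 42]
new_design = [38, 40, 37, 39, 41, 38, 39]
d = cohens_d(old_design, new_design)  # |d| ~ 0.2 small, ~0.5 medium, ~0.8 large
```

Here d is negative (the new design is faster) and its magnitude, not a p-value, is what tells you the improvement is substantial.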
Sample Size
The number of participants in a research study. Appropriate sample size depends on research goals, method type (qualitative vs. quantitative), the precision required, and the number of distinct user segments being studied.