Statistical Significance and P-Values in Clinical Trials
What a p-value actually means, why p < 0.05 is the bar, the difference between statistical and clinical significance, and how to avoid being fooled by a 'trend.'
Why P-Values Decide Outcomes
When a Phase 3 trial reads out, the headline often hinges on a single number: the p-value on the primary endpoint. Understanding what that number means — and what it does not — separates investors who can read a readout from those who react to spin.
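A p-value answers one narrow question: if the drug truly had no effect, how likely would we be to see a result at least this extreme by chance alone? A p-value of 0.03 means there is roughly a 3% probability of seeing a result at least this extreme if the drug did nothing. It is not the probability that the drug does not work, and it says nothing about how large the benefit is.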
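As a rough illustration of that definition, the sketch below computes a two-sided p-value for a difference in response rates using a pooled two-proportion z-test. The function name and the trial numbers (60 of 200 responders on drug, 40 of 200 on placebo) are hypothetical, and a real trial's statistical analysis plan may specify a different test entirely.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(events_a, n_a, events_b, n_b):
    """Two-sided p-value for a difference in response rates (pooled z-test).

    Assumes large samples so the normal approximation is reasonable.
    """
    rate_a = events_a / n_a
    rate_b = events_b / n_b
    # Under the null hypothesis both arms share one underlying rate.
    pooled = (events_a + events_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / standard_error
    # Probability of a result at least this extreme in either direction.
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical readout: 30% response on drug vs 20% on placebo.
p_value = two_proportion_p_value(60, 200, 40, 200)
print(f"p = {p_value:.3f}")
```

With these made-up numbers the p-value comes out at roughly 0.02: if the drug did nothing, a gap this large or larger would show up about 2% of the time.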
The 0.05 Threshold
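By convention, a result is "statistically significant" if the p-value is below 0.05, typically on a two-sided test. Trials are designed around this threshold: the sample size is chosen so that a real effect of the assumed size has a high probability of clearing it, and the FDA generally expects pre-specified primary endpoints to do so.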
A few critical clarifications:
- p < 0.05 is a yes/no gate, not a quality score. A drug that hits p = 0.049 met its endpoint; one at p = 0.051 did not. The biology may be similar, but the regulatory verdict differs.
- A "trend toward significance" (e.g., p = 0.08) is a miss. Companies sometimes describe near-misses this way. In confirmatory testing, close doesn't count.
- Two positive trials are often expected. For many indications the FDA wants substantial evidence — frequently two adequate and well-controlled studies, or one very robust study — not a single borderline result.
Statistical Significance ≠ Clinical Significance
This is the most important distinction and the one investors most often miss. A result can be statistically significant yet clinically trivial. If a trial is large enough, even a tiny difference can clear p < 0.05.
So always ask two questions in sequence:
- Is it statistically significant? Did the primary endpoint hit p < 0.05?
- Is the effect size clinically meaningful? Does the magnitude of benefit actually matter to patients and prescribers — enough to win a label and adoption?
A statistically significant but clinically marginal result can still struggle at an advisory committee or fail to gain commercial traction even after approval.
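The claim that a large enough trial can make a tiny difference significant can be checked with a few lines of arithmetic. The sketch below holds a hypothetical effect fixed (a 0.5-point improvement on a scale with a standard deviation of 10) and changes only the number of patients per arm. It uses a simple z-test that treats the standard deviation as known, which is an assumption made for illustration.

```python
from math import sqrt
from statistics import NormalDist


def p_value_for_mean_difference(difference, sd, n_per_arm):
    """Two-sided p-value for a difference in means between two equal arms.

    Simplification: z-test with a known, shared standard deviation.
    """
    standard_error = sd * sqrt(2 / n_per_arm)
    z = difference / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Same hypothetical effect, two very different trial sizes.
p_small_trial = p_value_for_mean_difference(0.5, 10, 200)
p_large_trial = p_value_for_mean_difference(0.5, 10, 20_000)

print(f"200 per arm:    p = {p_small_trial:.3f}")
print(f"20,000 per arm: p = {p_large_trial:.2e}")
```

The effect is identical in both cases. Only the sample size moved the p-value from nowhere near significance to far below 0.05, which is why the effect size has to be judged separately.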
Confidence Intervals: The Underrated Number
Alongside the p-value, look at the confidence interval around the effect size. It tells you the plausible range of the true effect. A significant result with a wide confidence interval that nearly touches "no effect" is shakier than one with a tight interval comfortably away from it. The no-effect value is zero for a difference between arms and 1 for a ratio such as a hazard ratio. The confidence interval conveys precision in a way the p-value alone does not.
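To make the contrast between a wide and a tight interval concrete, here is a minimal sketch that builds a 95% confidence interval from a point estimate and its standard error under a normal approximation. Both "trials" and their numbers are hypothetical: they report the same 2.0-point benefit but with different precision.

```python
from statistics import NormalDist


def confidence_interval(estimate, standard_error, level=0.95):
    """Normal-approximation confidence interval for an effect estimate."""
    # For a 95% interval this critical value is about 1.96.
    z_critical = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z_critical * standard_error
    return estimate - margin, estimate + margin


# Same hypothetical point estimate, different standard errors.
wide_low, wide_high = confidence_interval(2.0, 1.0)
tight_low, tight_high = confidence_interval(2.0, 0.5)

print(f"Wide:  {wide_low:.2f} to {wide_high:.2f}")
print(f"Tight: {tight_low:.2f} to {tight_high:.2f}")
```

Both intervals exclude zero, so both results are significant at the 0.05 level. The first runs from about 0.04 to 3.96 and barely clears no effect, while the second runs from about 1.02 to 2.98 and supports a benefit of at least a point.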
Common Ways to Be Fooled
- Subgroup mining. A failed overall trial with a "significant" subgroup is often a false positive: slice the data enough ways and some subgroup will cross p < 0.05 by chance. Treat it as real only if that subgroup was pre-specified and statistically protected.
- Multiple endpoints without correction. Testing many outcomes inflates the chance of a spurious "win." Pre-specified hierarchical testing guards against this.
- Post-hoc analyses. Analyses dreamed up after seeing the data are hypothesis-generating, not confirmatory.
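The multiple-endpoint problem above can be quantified. The sketch below shows how the chance of at least one false "win" grows with the number of uncorrected tests, and then applies a simple fixed-sequence (hierarchical) rule in which endpoints are tested in a pre-specified order and testing stops at the first miss. The formula assumes independent tests, which real endpoints usually are not, and the function names and p-values are hypothetical.

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one false positive across independent tests."""
    return 1 - (1 - alpha) ** num_tests


def fixed_sequence_wins(ordered_p_values, alpha=0.05):
    """Count endpoints that can be claimed under fixed-sequence testing.

    Endpoints are tested in their pre-specified order, each at the full
    alpha, and everything after the first miss is not claimable.
    """
    wins = 0
    for p_value in ordered_p_values:
        if p_value >= alpha:
            break
        wins += 1
    return wins


for num_tests in (1, 5, 10):
    rate = familywise_error_rate(num_tests)
    print(f"{num_tests} tests: {rate:.1%} chance of a spurious win")

# Hypothetical hierarchy: primary, then three ordered secondaries.
hypothetical_p_values = [0.01, 0.03, 0.20, 0.004]
claimable = fixed_sequence_wins(hypothetical_p_values)
print(f"Claimable endpoints: {claimable}")
```

With ten uncorrected tests the chance of at least one spurious win is about 40%. In the hypothetical hierarchy only the first two endpoints can be claimed: the fourth has an impressive p-value, but it sits behind a miss and so does not count.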
Applying It
Before a Phase 3 readout, know the trial's statistical plan: the primary endpoint, the powering assumptions, and how many trials the FDA expects. After the readout, confirm the primary hit p < 0.05, check the effect size and confidence interval, and be skeptical of subgroup or secondary-endpoint spin when the primary missed.
A clean, pre-specified, statistically significant and clinically meaningful result is what carries a program toward its FDA decision. To follow upcoming readouts and the companies behind them, use the catalyst calendar and the relevant company page.