PhaseFolio Validation Studies

Back-Test Library

Every backtest evaluates the production rNPV engine against a held-out cohort of historical drugs whose real-world fate is now known, using only information available before each drug’s decision point. No future data leaks into the model. Three cohorts are published; each leads with the strongest signal its sample can support.

3 published cohorts · 111 drugs · indication-specific FDA approval as the success criterion

Immunology

Rheumatoid Arthritis

0.625 Pairwise AUC

16 drugs · Phase 2 entrants

Decision anchor: Phase 2 entry

Early calibration cohort. The signal is directional at small n: the Wilson 95% intervals on accuracy span chance level, so the result is read as direction, not confirmation.

DIRECTIONAL · View backtest →

Oncology

Non-Small-Cell Lung Cancer

0.709 Pairwise AUC

59 drugs · 41 approved / 18 failed

Decision anchor: Phase 2 entry

Strongest discrimination signal in the published cohorts: across 738 ranking pairs (41 approved × 18 failed), the AUC clears the conventional ≥0.70 bar for good discrimination.

36 drugs · 25 approved / 11 not

Decision anchor: Phase 3 entry

Before Sprint-1, the engine did not discriminate on this cohort (AUC 0.524). A single cohort-validatable scored multiplier raised the AUC to 0.629; the full ablation is published, not just the largest number.

PASS · View backtest →

How the Back-Tests Work

One Method, Compared Across Cohorts

Every cohort uses the same production engine, the same scoring discipline (pairwise AUC, Wilson-CI accuracy, risk-flag sensitivity), and one disclosed rule for the per-indication decision anchor: anchor at the earliest decision point at which the cohort’s failure population is observable in public registries. The methodology page carries the side-by-side cross-cohort comparison, the discrimination-vs-calibration framing, and the full antimicrobial Sprint-1 ablation.
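For readers who want to see what two of the scoring metrics named above measure, here is a minimal Python sketch of pairwise AUC and a Wilson score interval. It is an illustration under stated assumptions, not the production scoring code: the function names `pairwise_auc` and `wilson_interval`, the convention of counting ties as half a win, and the toy inputs are all hypothetical.

```python
from math import sqrt


def pairwise_auc(scores, approved):
    """Share of (approved, failed) pairs in which the approved drug scores higher.

    Ties count as half a win (an assumption of this sketch).
    """
    pos = [s for s, a in zip(scores, approved) if a]
    neg = [s for s, a in zip(scores, approved) if not a]
    if not pos or not neg:
        raise ValueError("need at least one approved and one failed drug")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Toy data (hypothetical): two approved and two failed drugs.
toy_auc = pairwise_auc([0.9, 0.8, 0.4, 0.3], [True, False, True, False])

# Number of ranking pairs in a cohort of 41 approved and 18 failed drugs.
n_pairs = 41 * 18

# Hypothetical accuracy of 10 correct calls out of 16 drugs.
lo, hi = wilson_interval(10, 16)
```

In the toy data, three of the four approved-versus-failed pairs are ranked correctly, so the pairwise AUC is 0.75. The pair count for 41 approved and 18 failed drugs is 738, which matches the figure quoted for the Phase 2 NSCLC cohort. For the hypothetical 10-of-16 accuracy, the Wilson interval contains 0.5, which is what "spans chance level" means for a cohort of that size.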

Read the backtest methodology →