PhaseFolio Validation Study

Back-Test Results: NSCLC Drug Cohort

59 historical non-small-cell lung cancer drugs (Phase 2 entrants 1979–2024) evaluated against PhaseFolio's rNPV engine. Pairwise AUC of 0.709 across 738 ranking pairs (523 concordant) is the strongest discrimination signal in the published cohorts. Absolute calibration trails discrimination, a consequence of registry-survivor cohort construction that is disclosed below.

2026-05-04 · 59 drugs (41 approved, 18 failed) · 10,000 MC iterations per drug
Pairwise AUC: 0.709 · target ≥0.60 · 523/738 concordant pairs · PASS
Phase-Controlled AUC: 0.709 · target ≥0.55 · controls for structural NDA/BLA advantage · PASS
Risk Flag Sensitivity: 100% · 18/18 failures flagged · target ≥70% · PASS
Separation Gap: 4.7pp · successes 8.3% vs failures 3.6% · target ≥10pp · WATCH

Key finding: Pairwise AUC of 0.709 on 738 ranking pairs is the strongest discrimination signal in the published PhaseFolio cohorts: the engine ranks an NSCLC success above an NSCLC failure in 70.9% of pairs. Phase-controlled AUC matches at 0.709, which argues against a structural NDA/BLA advantage driving the result. Risk-flag sensitivity is 100%: every one of the 18 failed drugs carried at least one model-emitted risk flag at decision time.

How We Built the Dataset

Raw ClinicalTrials.gov data lacks the drug-level structure needed for decision-point reconstruction. The NSCLC enrichment pipeline transformed 5,167 raw NSCLC trials into a structured cohort with mechanism, target, FDA linkage, and outcome data.

1. Ingest Raw CT.gov Data
192,411 interventional studies were ingested via the ClinicalTrials.gov API. Linked condition mappings (420K rows) and intervention data (424K rows) are stored in Supabase.
2. Filter for NSCLC
5,167 unique NSCLC trials were identified by condition text matching across Phase 1 through Phase 4, with enrollment windows spanning 1979 to 2024.
3. Cross-Reference 4 Data Sources
Each trial was enriched by an AI agent that cross-references four sources: ClinicalTrials.gov (structured fields), FDA Drugs@FDA (regulatory data + approval dates), PubMed (published efficacy), and web search (press releases, analyst reports). A confidence score is computed per trial.
4. Drug-Class & MoA Mapping
Pharmacology mapping per drug: drug_class (100% coverage), mechanism_of_action (57.8%), FDA application linkage (39.1%), modality, target. A seed CSV of 91 NSCLC drug entries anchored the canonical mapping.
5. Cohort Derivation (Approvals + Failures)
Approvals were auto-derived from drug_commercial_profiles where an NSCLC indication and an FDA approval date are present (41 drugs). Failures were manually curated from terminated Phase 2/3 NSCLC programs in enriched_trials (18 drugs). Combined cohort: 59 drugs.
6. Survivor Bias Verification
Completion-to-termination ratios were compared between raw CT.gov NSCLC (5,167) and the enriched dataset. The survivorship gap is ≤2.3pp at every clinical phase, indicating that the enrichment process did not selectively retain successful trials.
NSCLC Trials: 5,167
Cohort Drugs: 59
Approved / Failed: 41 / 18
Ranking Pairs: 738
Drug Class Coverage: 100%
Mechanism of Action: 57.8%
FDA Application Linkage: 39.1%
Quantitative Efficacy Data: 5.3%

Survivor bias verified to within 2.3pp at every phase: completion-to-termination ratios in the enriched 5,167-trial dataset match the raw CT.gov NSCLC corpus across Phase 1, 2, 3, and 4. The cohort itself remains a registry-survivor subset of the universe of all programs that ever entered Phase 2; programs that died before public disclosure are unrepresented. This is a property of the source data, not the enrichment pass, and it inflates observed approval rates in calibration plots independent of engine accuracy (see Limitations).
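The survivorship check can be sketched as a per-phase comparison. This is a minimal illustration, not the pipeline's code: it assumes the "completion-to-termination ratio" is expressed as the completed share of completed-plus-terminated trials (so that a gap in percentage points is meaningful), and every count below is hypothetical.

```python
def completion_share(completed, terminated):
    """Completed trials as a percentage of completed + terminated trials."""
    return 100.0 * completed / (completed + terminated)


def survivorship_gaps(raw, enriched):
    """Absolute gap in percentage points between two datasets, per phase.

    Both arguments map a phase label to (completed, terminated) counts.
    """
    return {
        phase: abs(completion_share(*enriched[phase]) - completion_share(*raw[phase]))
        for phase in raw
    }


# Hypothetical counts for illustration only; not taken from the study.
raw_counts = {"Phase 1": (800, 200), "Phase 2": (1500, 500)}
enriched_counts = {"Phase 1": (405, 95), "Phase 2": (760, 240)}

gaps = survivorship_gaps(raw_counts, enriched_counts)
max_gap = max(gaps.values())
passes = max_gap <= 2.3  # threshold quoted in the text
```

A check of this shape only compares the enriched data with the registry; it cannot detect programs that never reached the registry, which is the residual bias described above.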

How the Back-Test Works

Each drug is evaluated using only information available before its real-world decision point. No future data leaks into the model.
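The information frontier can be sketched as a date rule plus a filter. The rule is the one given in step 1 of the list that follows; the function names, the evidence record layout, and the Feb 29 clamping are assumptions made for this illustration.

```python
from datetime import date


def shift_years(d, years):
    """Move a date back by whole years, clamping Feb 29 to Feb 28."""
    try:
        return d.replace(year=d.year - years)
    except ValueError:
        return d.replace(year=d.year - years, day=28)


def decision_date(phase3_starts, fda_approval_date=None):
    """Earliest Phase 3 start minus 12 months, else approval date minus 4 years."""
    if phase3_starts:
        return shift_years(min(phase3_starts), 1)
    if fda_approval_date is not None:
        return shift_years(fda_approval_date, 4)
    return None


def visible_at(evidence, frontier):
    """Keep only evidence published strictly before the decision date."""
    return [item for item in evidence if item["published"] < frontier]


# Hypothetical drug with two registered Phase 3 trials.
frontier = decision_date([date(2012, 6, 1), date(2011, 3, 15)])
evidence = [
    {"source": "phase 1 readout", "published": date(2009, 11, 2)},
    {"source": "phase 3 readout", "published": date(2014, 5, 20)},
]
allowed = visible_at(evidence, frontier)
```

Here only the 2009 readout survives the filter, because the 2014 publication postdates the reconstructed decision point.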

1. Reconstruct Decision Point
For each drug, the decision date is the earliest Phase 3 NSCLC trial start minus 12 months (or fda_approval_date − 4 years where no Phase 3 trial is registered). This date defines the information frontier the model is allowed to see.
2. Apply BIO/QLS Base Rates
Phase-by-phase transition probabilities are sourced from the BIO/QLS 2021 oncology cohort (12,728+ stage transitions). The same base table is used in production rNPV.
3. Apply Modifiers via Logistic Path
Genetic validation, biomarker strategy, orphan designation, and first-in-class flags are applied through the log-odds path to keep PoS bounded in [0,1]. Multipliers are gated by source-publication date so post-decision evidence cannot leak.
4. Risk-Flag Emission
Generic risk flags (HIGH_COMPETITION, LIMITED_TRIAL_DATA, FIRST_IN_CLASS_RISK) are emitted from enriched_trials counts and cohort metadata at the decision date. Class-specific NSCLC risk tables are not yet populated.
5. Run rNPV Engine + Monte Carlo
10,000 iterations per drug with Bernoulli stage gates. This is the same production engine used by PhaseFolio customers. Per-drug output: predicted cumulative PoS, rNPV, eNPV, MC percentiles.
6. Score Against Actual Outcomes
Pairwise AUC is computed over all 41×18 = 738 success/failure pairs. Phase-controlled AUC is computed within Phase 2 (the only phase with both successes and failures in the cohort). Risk-flag sensitivity is the fraction of failures flagged at decision.
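Steps 3 and 5 can be illustrated together. The sketch below is not the production engine: it assumes each modifier acts as an odds ratio added in log-odds space (the source says only that modifiers go through the log-odds path), and the stage probabilities and multiplier value are hypothetical.

```python
import math
import random


def logit(p):
    return math.log(p / (1.0 - p))


def inv_logit(x):
    return 1.0 / (1.0 + math.exp(-x))


def apply_modifiers(base_p, multipliers):
    """Shift a base probability in log-odds space.

    Assumption: each multiplier is treated as an odds ratio, so the
    result always stays strictly between 0 and 1.
    """
    return inv_logit(logit(base_p) + sum(math.log(m) for m in multipliers))


def simulate_cumulative_pos(stage_probs, iterations=10_000, seed=0):
    """Monte Carlo with Bernoulli stage gates: a run succeeds only if
    every stage draw passes."""
    rng = random.Random(seed)
    successes = sum(
        all(rng.random() < p for p in stage_probs) for _ in range(iterations)
    )
    return successes / iterations


# Hypothetical stage probabilities and a hypothetical biomarker multiplier.
base_stages = [0.25, 0.40, 0.90]
adjusted_stages = [apply_modifiers(base_stages[0], [3.0])] + base_stages[1:]
estimate = simulate_cumulative_pos(adjusted_stages)
analytic = math.prod(adjusted_stages)
```

With independent gates the simulated cumulative PoS converges on the product of the stage probabilities; the simulation earns its keep when cash flows and timing are sampled alongside the gates, which this sketch omits.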

Predicted Cumulative PoS by Drug

Bars show the model's predicted cumulative probability of success at the decision point, sorted within group. Top 12 of 41 approved + top 12 of 18 failed shown for readability; full 59-drug cohort table follows.

Approved (top 12 of 41)
Drug · Brand · Mechanism · Predicted PoS
etoposide · VePesid · Topoisomerase II inhibitor · 33.4%
gemcitabine · Gemzar · Nucleoside analog chemotherapy · 25.4%
docetaxel · Taxotere · Taxane chemotherapy · 21.0%
cisplatin · Platinol · Platinum chemotherapy · 19.4%
carboplatin · Paraplatin · Platinum chemotherapy · 19.4%
paclitaxel · Taxol · Taxane chemotherapy · 19.4%
osimertinib · Tagrisso · EGFR tyrosine kinase inhibitor · 18.0%
erlotinib · Tarceva · EGFR tyrosine kinase inhibitor · 15.7%
vinorelbine · Navelbine · Vinca alkaloid chemotherapy · 12.7%
bevacizumab · Avastin · Anti-VEGF biologic · 12.6%
pemetrexed · Alimta · Antifolate chemotherapy · 11.9%
gefitinib · Iressa · EGFR tyrosine kinase inhibitor · 11.7%

Failed (top 12 of 18)
Drug · Brand · Mechanism · Predicted PoS
mage-a3 vaccine · MAGE-A3 ASCI · MAGE-A3 cancer vaccine · 6.8%
figitumumab · Figitumumab (CP-751,871) · Anti-IGF-1R antibody · 5.4%
aflibercept (nsclc) · Zaltrap (NSCLC arm) / approved Zaltrap CRC is separate · VEGF trap (recombinant fusion protein) · 5.4%
cabiralizumab · FPA008 · Anti-CSF1R antibody (TAM modulation) · 5.2%
belagenpumatucel-l · Lucanix · TGF-beta2 antisense allogeneic tumor… · 4.8%
veliparib (nsclc) · Veliparib (NSCLC) · PARP inhibitor · 4.3%
cixutumumab · IMC-A12 · Anti-IGF-1R antibody · 4.0%
dalotuzumab · MK-0646 · Anti-IGF-1R antibody · 4.0%
rociletinib · CO-1686 · 3rd-gen EGFR T790M TKI · 3.7%
selumetinib (nsclc) · Selumetinib (NSCLC arm) / later Koselugo (different indication) · MEK1/2 inhibitor · 3.3%
stimuvax · Stimuvax (tecemotide / L-BLP25) · MUC1 cancer vaccine · 3.1%
talactoferrin · Talactoferrin alfa · Recombinant lactoferrin oral immunom… · 3.1%

Mean PoS (approved): 8.3% · Mean PoS (failed): 3.6% · Separation: +4.7pp · Pairwise AUC: 0.709
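The pairwise AUC in this summary is a count over every success/failure pair. The sketch below shows one way to compute it; the half-credit for ties is an assumption (the source reports only concordant pairs, and 523/738 already rounds to 0.709), and the toy scores are hypothetical.

```python
from itertools import product


def pairwise_auc(success_scores, failure_scores, tie_weight=0.5):
    """Return (auc, concordant, tied, total) over all success/failure pairs.

    A pair is concordant when the success has the higher predicted PoS.
    Ties receive tie_weight credit (an assumption, not the source's rule).
    """
    concordant = tied = 0
    for s, f in product(success_scores, failure_scores):
        if s > f:
            concordant += 1
        elif s == f:
            tied += 1
    total = len(success_scores) * len(failure_scores)
    auc = (concordant + tie_weight * tied) / total
    return auc, concordant, tied, total


# Hypothetical toy scores: 3 successes x 2 failures = 6 pairs.
toy_auc, toy_concordant, toy_tied, toy_total = pairwise_auc(
    [0.20, 0.10, 0.02], [0.05, 0.03]
)

# Arithmetic check on the figures reported in the study.
reported_pairs = 41 * 18          # 738
reported_auc = 523 / reported_pairs
```

In the toy case four of the six pairs are concordant; the low-scoring success at 0.02 loses to both failures, which is the same pattern the checkpoint inhibitors produce in the real cohort.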

59-Drug NSCLC Back-Test Cohort

Drug · Brand · Mechanism · Outcome
etoposide · VePesid · Topoisomerase II inhibitor · Approved
gemcitabine · Gemzar · Nucleoside analog chemotherapy · Approved
docetaxel · Taxotere · Taxane chemotherapy · Approved
cisplatin · Platinol · Platinum chemotherapy · Approved
carboplatin · Paraplatin · Platinum chemotherapy · Approved
paclitaxel · Taxol · Taxane chemotherapy · Approved
osimertinib · Tagrisso · EGFR tyrosine kinase inhibitor · Approved
erlotinib · Tarceva · EGFR tyrosine kinase inhibitor · Approved
vinorelbine · Navelbine · Vinca alkaloid chemotherapy · Approved
bevacizumab · Avastin · Anti-VEGF biologic · Approved
pemetrexed · Alimta · Antifolate chemotherapy · Approved
gefitinib · Iressa · EGFR tyrosine kinase inhibitor · Approved
mobocertinib · Exkivity · EGFR exon 20 insertion inhibitor · Approved
cemiplimab · Libtayo · Anti-PD-1 checkpoint inhibitor · Approved
lorlatinib · Lorbrena · ALK tyrosine kinase inhibitor · Approved
brigatinib · Alunbrig · ALK tyrosine kinase inhibitor · Approved
selpercatinib · Retevmo · RET selective inhibitor · Approved
capmatinib · Tabrecta · MET tyrosine kinase inhibitor · Approved
amivantamab · Rybrevant · EGFR/MET bispecific antibody · Approved
tepotinib · Tepmetko · MET tyrosine kinase inhibitor · Approved
entrectinib · Rozlytrek · ROS1/NTRK tyrosine kinase inhibitor · Approved
trastuzumab deruxtecan · Enhertu · HER2-directed ADC · Approved
larotrectinib · Vitrakvi · NTRK kinase inhibitor · Approved
ipilimumab · Yervoy · CTLA-4 checkpoint inhibitor · Approved
pralsetinib · Gavreto · RET selective inhibitor · Approved
crizotinib · Xalkori · ALK tyrosine kinase inhibitor · Approved
adagrasib · Krazati · KRAS G12C inhibitor · Approved
dacomitinib · Vizimpro · Pan-HER tyrosine kinase inhibitor · Approved
sotorasib · Lumakras · KRAS G12C inhibitor · Approved
tislelizumab · Tevimbra · PD-1 checkpoint inhibitor · Approved
afatinib · Gilotrif · EGFR tyrosine kinase inhibitor · Approved
ramucirumab · Cyramza · Anti-VEGFR2 monoclonal antibody · Approved
necitumumab · Portrazza · Anti-EGFR monoclonal antibody · Approved
datopotamab deruxtecan · Datroway · TROP2-directed ADC · Approved
pembrolizumab · Keytruda · Anti-PD-1 checkpoint inhibitor · Approved
ceritinib · Zykadia · ALK tyrosine kinase inhibitor · Approved
atezolizumab · Tecentriq · Anti-PD-L1 checkpoint inhibitor · Approved
durvalumab · Imfinzi · Anti-PD-L1 checkpoint inhibitor · Approved
alectinib · Alecensa · ALK tyrosine kinase inhibitor · Approved
nab-paclitaxel · Abraxane · Taxane chemotherapy (albumin-bound) · Approved
nivolumab · Opdivo · Anti-PD-1 checkpoint inhibitor · Approved
mage-a3 vaccine · MAGE-A3 ASCI · MAGE-A3 cancer vaccine · Failed (Ph 3)
figitumumab · Figitumumab (CP-751,871) · Anti-IGF-1R antibody · Failed (Ph 3)
aflibercept (nsclc) · Zaltrap (NSCLC arm) / approved Zaltrap CRC is separate · VEGF trap (recombinant fusion protein) · Failed (Ph 3)
cabiralizumab · FPA008 · Anti-CSF1R antibody (TAM modulation) · Failed (Ph 2)
belagenpumatucel-l · Lucanix · TGF-beta2 antisense allogeneic tumor… · Failed (Ph 3)
veliparib (nsclc) · Veliparib (NSCLC) · PARP inhibitor · Failed (Ph 3)
cixutumumab · IMC-A12 · Anti-IGF-1R antibody · Failed (Ph 2)
dalotuzumab · MK-0646 · Anti-IGF-1R antibody · Failed (Ph 2)
rociletinib · CO-1686 · 3rd-gen EGFR T790M TKI · Failed (Ph 2)
selumetinib (nsclc) · Selumetinib (NSCLC arm) / later Koselugo (different indication) · MEK1/2 inhibitor · Failed (Ph 3)
stimuvax · Stimuvax (tecemotide / L-BLP25) · MUC1 cancer vaccine · Failed (Ph 3)
talactoferrin · Talactoferrin alfa · Recombinant lactoferrin oral immunom… · Failed (Ph 3)
demcizumab · OMP-21M18 · Anti-DLL4 antibody (Notch pathway) · Failed (Ph 2)
bavituximab · Bavituximab (PGN401) · Anti-phosphatidylserine antibody · Failed (Ph 3)
custirsen · OGX-011 · Clusterin antisense oligonucleotide · Failed (Ph 3)
ganetespib · STA-9090 · HSP90 inhibitor · Failed (Ph 3)
patritumab · U3-1287 / patritumab · Anti-HER3 antibody · Failed (Ph 2)
tergenpumatucel-l · HyperAcute Lung · Allogeneic whole-cell vaccine · Failed (Ph 2)

Deep Dives

Strongest No-Go Signal

Tergenpumatucel-L

Allogeneic whole-cell NSCLC vaccine · NewLink Genetics · Decision: 2009
PhaseFolio assigned the lowest cumulative PoS in the cohort (1%) with multiple risk flags reflecting first-in-class allogeneic-vaccine modality, limited prior precedent in NSCLC, and the small clinical footprint visible at decision time. Monte Carlo distribution skewed heavily negative.
Actual outcome: Phase 2 terminated for lack of overall-survival benefit; program discontinued. Vaccine modality has yet to deliver an NSCLC approval.
Predicted PoS: 1%
Phase 2: Failed
Cohort AUC: 0.709

Honest Limitation

Checkpoint Inhibitors Under-Predicted

Anti-PD-1 / PD-L1 modality class · Identified in this cohort
Nivolumab (Opdivo), atezolizumab (Tecentriq), and durvalumab (Imfinzi) each received a predicted cumulative PoS of 1–2% at their NSCLC Phase 2 decision points, placing them at the bottom of the ranking. All three were approved. The model penalized them through the cumulative-LoA prior on the “biologic” modality bucket, without crediting the structural shift in oncology success rates that the checkpoint-inhibitor class was producing.
We disclose this rather than hide it: pairwise AUC stays strong because most checkpoint-inhibitor approvals still rank above the failures the model correctly flagged. Class-specific NSCLC modifier tables (analogous to RA's anti-TNF / JAK / IL-6 tables) are the planned correction.
Predicted PoS: 1–2%
Approved (FDA): 3 / 3
Modifier Table: Open

Discrimination is strong; calibration trails. Pairwise AUC of 0.709 shows that the engine ranks NSCLC successes above failures in about 71% of pairs. The separation gap (4.7pp between mean predicted PoS for successes vs failures) and the 0–15% calibration bucket (51 drugs predicted, 64.7% actual approval rate) reflect two distinct effects: (1) cohort survivor bias from registry-visible Phase 2 entrants inflates observed approval rates above the population base rate the engine is calibrated to, and (2) the checkpoint-inhibitor class shift documented in the case study. Class-specific NSCLC modifier tables and cohort expansion to registry-invisible Phase 2 programs are the planned next steps. See the backtest methodology for full discrimination-vs-calibration framing.
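The calibration-bucket figure can be reproduced with a small helper. This is a sketch under stated assumptions: the bucket is taken as the half-open interval [0%, 15%), the toy inputs are hypothetical, and the count of 33 approvals is inferred from 64.7% of 51 rather than stated in the study.

```python
def bucket_calibration(predicted, approved, lo, hi):
    """Summarise drugs whose predicted PoS falls in [lo, hi).

    Returns None for an empty bucket, else the bucket size, the mean
    predicted PoS, the observed approval rate, and their gap in pp.
    """
    members = [(p, a) for p, a in zip(predicted, approved) if lo <= p < hi]
    if not members:
        return None
    n = len(members)
    mean_predicted = sum(p for p, _ in members) / n
    observed_rate = sum(1 for _, a in members if a) / n
    return {
        "n": n,
        "mean_predicted": mean_predicted,
        "observed_rate": observed_rate,
        "gap_pp": 100.0 * (observed_rate - mean_predicted),
    }


# Hypothetical toy cohort: four drugs fall in the 0-15% bucket.
toy = bucket_calibration(
    [0.02, 0.05, 0.10, 0.12, 0.30],
    [True, False, True, True, True],
    0.0,
    0.15,
)

# Arithmetic behind the reported bucket (approval count is inferred).
bucket_n = 51
bucket_approved = round(0.647 * bucket_n)
reported_rate = bucket_approved / bucket_n
```

A large positive gap in a low-PoS bucket is what the two effects above would produce: the observed rate is measured on a survivor-enriched cohort, while the predictions are anchored to population base rates.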