PhaseFolio — Drug-Specific Predictive Signals in Oncology: What Held and What Didn't

We tested two drug-specific signals against a held-out cohort of oncology programs whose outcomes are now known. Biomarker quality held and is scored in the engine. Early-phase objective response rate (ORR) magnitude, a signal several vendors market, did not clear our bar and is not scored.

2026-05-28 · evaluations@2026-05-28 · 2 verdicts cited

What we validated

Biomarker quality — whether a program enrolls by a genomic-grade molecular alteration — is the drug-specific signal that held. On a held-out NSCLC cohort (N=85) it added 5.2 percentage points of AUC over the engine's structural baseline, and it is scored in the production engine today. The genomic-validated cohort odds ratio (5.59) ran well above the decade-old literature anchor (1.35); we ship the conservative anchor and disclose the overshoot rather than overfit to our own cohort.
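
For readers who want the odds-ratio comparison above in concrete terms, the following is a minimal sketch of how a cohort odds ratio and an approximate confidence interval can be computed from a 2×2 table. The function name and the four counts are hypothetical illustrations, not PhaseFolio's cohort data or code. The interval uses the standard Woolf (log) method, which may differ from whatever the full report uses.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def odds_ratio_with_ci(a, b, c, d, level=0.95):
    """Odds ratio and Woolf (log-method) confidence interval for a 2x2 table.

    a: biomarker-selected programs that succeeded
    b: biomarker-selected programs that failed
    c: unselected programs that succeeded
    d: unselected programs that failed
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this simple estimator")
    estimate = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    low = exp(log(estimate) - z * se_log)
    high = exp(log(estimate) + z * se_log)
    return estimate, low, high


# Hypothetical counts, chosen only to illustrate the arithmetic.
estimate, low, high = odds_ratio_with_ci(12, 8, 10, 30)
print(f"OR = {estimate:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

In small cohorts the interval around an odds ratio is typically wide, which is one general reason to prefer a conservative external anchor over a single cohort's point estimate.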

What we tested and retired

We also tested early-phase objective response rate (ORR) magnitude, a signal several incumbents market as predictive. It did not clear our bar, and we never shipped it as a scored signal. A joint biomarker×ORR model beat biomarker quality alone by only 0.5 percentage points of AUC (not statistically distinguishable, p=0.48), and the comparison cannot be adequately powered at any cohort size we can assemble: detecting a 3-point gain would take roughly 830 drugs, against a cohort of 85. The drugs with extractable early-phase ORR are overwhelmingly the genomic-validated successes that biomarker quality already identifies, and failures rarely report ORR, so it carries little independent signal. We publish the negative result in full; the analysis and its inputs are detailed in the full report.

Scope & honesty

The biomarker-quality verdict is evidenced on NSCLC; its generalization to other solid tumors is under active validation and will be published as it completes. This is a transparent methods evaluation, not a peer-reviewed paper.

Signal Results

Biomarker quality

NSCLC · Phase II/III

Validated

+5.2pp held-out AUC over the structural baseline, stable across cohort sizes. The genomic_validated cohort odds ratio was 5.59 vs the Schwaederle (2016) literature anchor of 1.35 — we ship the lower anchor with the overshoot disclosed.

Decision: Scored. genomic_validated 1.35x / protein_only 0.85x / unselected 1.00x (log-odds, Phase II/III).
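
As an illustration of what a log-odds multiplier does to a probability, here is a minimal sketch. It assumes the three factors above are odds multipliers added to the baseline in log-odds space. The function name, the dictionary, and the baseline probability are hypothetical and are not taken from the production engine.

```python
from math import exp, log

# Multipliers as listed in the decision line above. Treating them as odds
# multipliers applied in log-odds space is an assumption of this sketch.
MULTIPLIERS = {
    "genomic_validated": 1.35,
    "protein_only": 0.85,
    "unselected": 1.00,
}


def adjust_probability(base_p, biomarker_class):
    """Shift a baseline success probability by a biomarker odds multiplier."""
    if not 0.0 < base_p < 1.0:
        raise ValueError("base_p must be strictly between 0 and 1")
    logit = log(base_p / (1.0 - base_p)) + log(MULTIPLIERS[biomarker_class])
    return 1.0 / (1.0 + exp(-logit))


# Hypothetical baseline of 50%: odds of 1.0 become 1.35, i.e. 1.35 / 2.35.
adjusted = adjust_probability(0.5, "genomic_validated")
print(round(adjusted, 3))  # 0.574
```

Working in log-odds keeps the adjusted value inside (0, 1) for any baseline, which multiplying a probability directly would not guarantee.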

Phase 1 objective response rate (ORR magnitude)

Oncology solid tumor · Phase II/III

Not Predictive

A joint biomarker×ORR-bucket model beat the biomarker-only baseline by only +0.5pp held-out AUC (paired DeLong p=0.48), and the comparison cannot be adequately powered: detecting a +3pp gain at this baseline needs ~830 drugs, and the cohort is 85. This followed the Phase 1 finding that the two signals combined fell below baseline (-0.3pp at 43-drug coverage), the double-counting that motivated the joint-table test.

Decision: Not scored. Remains a non-scored, surfaced flag in engine 2.6.0. Published as a transparent negative result.
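
To show why a small AUC gain demands a large cohort, the following is a rough power sketch. It is not the calculation behind the ~830 figure: it swaps in the Hanley-McNeil (1982) variance approximation with an assumed correlation between the two AUC estimates, rather than a DeLong-based analysis. The baseline AUC, success rate, and correlation used in the example call are hypothetical, so the result should not be expected to match the reported number.

```python
from math import sqrt
from statistics import NormalDist


def hanley_mcneil_var(auc, n_pos, n_neg):
    """Approximate variance of an AUC estimate (Hanley and McNeil, 1982)."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    num = (auc * (1 - auc)
           + (n_pos - 1) * (q1 - auc ** 2)
           + (n_neg - 1) * (q2 - auc ** 2))
    return num / (n_pos * n_neg)


def power_for_n(n, auc_base, delta, pos_rate, r, alpha=0.05):
    """Approximate power to detect an AUC gain of `delta` in a paired test.

    r is the assumed correlation between the two AUC estimates (r < 1).
    """
    n_pos = max(2, round(n * pos_rate))
    n_neg = max(2, n - n_pos)
    v1 = hanley_mcneil_var(auc_base, n_pos, n_neg)
    v2 = hanley_mcneil_var(auc_base + delta, n_pos, n_neg)
    se_diff = sqrt(v1 + v2 - 2 * r * sqrt(v1 * v2))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(delta / se_diff - z_crit)


def required_n(auc_base, delta, pos_rate, r, target=0.8, n_max=100_000):
    """Smallest cohort size reaching the target power, or None if not found."""
    for n in range(10, n_max + 1):
        if power_for_n(n, auc_base, delta, pos_rate, r) >= target:
            return n
    return None


# Hypothetical inputs: baseline AUC 0.70, 30% successes, correlation 0.5.
n_needed = required_n(auc_base=0.70, delta=0.03, pos_rate=0.3, r=0.5)
print(n_needed)
```

The required size scales roughly with the inverse square of the gain, so halving the detectable gain roughly quadruples the number of drugs needed.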