
JHTV Deep-Dive Production

How the top-10 asset deep-dives in the JHTV public-data case study were produced. The rubric ranking, rNPV envelopes, Monte Carlo distributions, and dossier exports come from the PhaseFolio engine and are reproducible by any user. The asset-level workups — mechanism plausibility, comparator status, indication–target sanity, regulatory pathway, investment thesis — were generated by dedicated agents running Claude Opus 4.7 (Anthropic flagship reasoning model, maximum thinking effort, 1M-token context) against ClinicalTrials.gov, FDA Drugs@FDA, and primary literature. No credentialed MD or PhD subject-matter expert reviewed the final output. Every finding is stated against named, citable sources and is independently verifiable.

Model: Claude Opus 4.7 (Anthropic flagship reasoning)
Thinking effort: Maximum
Context window: 1M tokens
Agent architecture: One agent per asset, parallel dispatch
Sources queried: ClinicalTrials.gov, FDA Drugs@FDA, primary literature, patent filings
MD / PhD review: None
Audit pass: Parallel Claude Opus 4.7 agents, one per asset, CMO-lens prompt
Audit findings: Material findings on 8 of 10 top-10 assets
Verification model: Source-anchored — every finding traces to a citable identifier
01

Two-tier scope

What is engine output and what is AI-augmented analysis.

The JHTV public-data case study has two layers, produced by different parts of the PhaseFolio system. Both ship publicly; this page draws the line between them so a reader knows what is engine output versus what is AI-augmented analysis.

Tier 1 — Engine output (self-serve, reproducible). The full ranking of all ~602 inventions, the three-dimensional rubric scores, the rNPV envelopes for the top 10, the Monte Carlo distributions, the tornado sensitivity tables, and the dossier export format are produced by the PhaseFolio engine. Any user with their own portfolio could run the same pipeline and get the same shape of result. The rubric methodology, version-pinned at methodology@2026-05-07 with rubric weights locked at 40% clinical relevance / 30% modality fit / 30% whitespace, is documented separately at /methodology/jhtv-portfolio.
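
For concreteness, the locked weights reduce the three dimension scores to a single weighted sum. A minimal sketch in Python, assuming each dimension is scored on a 0–1 scale (the dimension names follow the methodology page; the example scores are placeholders, not a real asset):

    # Rubric weights as locked at methodology@2026-05-07.
    WEIGHTS = {"clinical_relevance": 0.40, "modality_fit": 0.30, "whitespace": 0.30}

    def rubric_score(scores: dict[str, float]) -> float:
        """Weighted rubric score over the three locked dimensions."""
        assert set(scores) == set(WEIGHTS), "all three dimensions are required"
        return sum(WEIGHTS[dim] * scores[dim] for dim in WEIGHTS)

    # Placeholder scores:
    # rubric_score({"clinical_relevance": 0.8, "modality_fit": 0.6, "whitespace": 0.7})
    # -> 0.71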

Tier 2 — AI-augmented deep-dives (assisted, not yet self-serve). Each top-10 asset received a dedicated workup covering mechanism plausibility (does the modality reach the target?), comparator status verification (are cited competitors active, discontinued, or stalled?), indication–target sanity (does the indication mapping match the actual target/mechanism class?), trial-design feasibility, regulatory pathway plausibility, and a written investment thesis. These were produced by AI agents, not by the engine; the orchestration is not yet a one-click feature in the product.

02

The model

Claude Opus 4.7, maximum thinking effort, 1M-token context window.

Each per-asset agent ran Claude Opus 4.7 — Anthropic's flagship reasoning model — at maximum thinking effort with the 1M-token context window. Maximum thinking effort means each response uses the most extended internal reasoning Anthropic exposes for the model; the 1M-token context allows full ingestion of the asset's source material (JHU listing, related ClinicalTrials.gov records, FDA filings where applicable, primary literature) inside a single agent context without summarization compression.

No fine-tuning was applied. No proprietary biology knowledge graph was consulted. Each agent worked from the model's pretraining knowledge, the structured prompt, and the source material retrieved via web search and registry queries during its execution.
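
For readers who want the call shape, a minimal sketch of one agent turn against the Anthropic Messages API with extended thinking enabled. The model identifier and token budgets below are placeholders (and long-context access may gate behind an additional flag); the thinking parameter follows Anthropic's published extended-thinking API:

    import anthropic

    # Placeholder prompt; real agents received the structured per-asset prompt
    # described in section 03.
    agent_prompt = "Assess mechanism plausibility for asset JHU-XXXX against ..."

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    response = client.messages.create(
        model="claude-opus-4-7",  # hypothetical identifier for the model named above
        max_tokens=32_000,        # must exceed the thinking budget
        thinking={"type": "enabled", "budget_tokens": 30_000},
        messages=[{"role": "user", "content": agent_prompt}],
    )
    print(response.content[-1].text)  # final text block follows any thinking blocks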

03

Agent architecture

One agent per asset, dispatched in parallel, scoped by structured prompt.

One agent per asset, dispatched in parallel; a minimal dispatch sketch follows the list. Each agent received a self-contained prompt scoped to a single invention. The prompt instructed the agent to:

  • Verify mechanism plausibility against the asset's stated target and modality (e.g., “does an IgG mAb reach an intracellular target?”, “does an 8.5 kb CDS fit AAV's 4.7 kb capacity?”, “is ‘personalized allogeneic’ cell therapy biologically coherent?”).
  • Verify comparator status by querying ClinicalTrials.gov for active/terminated trials and ingesting recent press releases, then explicitly stating each comparator's status as of the build date (e.g., catching that VX-522 was discontinued in May 2026).
  • Check indication–target sanity against the actual mechanism class (e.g., POLθ inhibitors typically don't belong in lung; CFTR isn't fixed by a mAb; vectored immunoprophylaxis is gene-therapy-delivered antibody, not a conventional mAb).
  • Assess trial-design feasibility against patient population sizing and endpoint precedent for the asset's indication.
  • Assess regulatory pathway plausibility (orphan designation, accelerated approval, breakthrough, RMAT) against the actual indication and target population.
  • Synthesize an investment thesis anchored to the engine's calibrated rubric weights and rNPV envelope, so the prose cannot drift from the underlying methodology.
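
The dispatch sketch under these assumptions; run_asset_agent is a hypothetical stand-in for whatever executes one scoped prompt (model call plus web-search and registry tool use):

    import asyncio

    async def run_asset_agent(asset_id: str, prompt: str) -> dict:
        # Placeholder body: a real run would call the model with tool access
        # and return the structured deep-dive (findings plus citations).
        await asyncio.sleep(0)  # stand-in for the model call
        return {"asset": asset_id, "findings": []}

    async def dispatch(assets: dict[str, str]) -> dict[str, dict]:
        """One self-contained agent per asset, all dispatched concurrently."""
        tasks = {aid: asyncio.create_task(run_asset_agent(aid, prompt))
                 for aid, prompt in assets.items()}
        return {aid: await task for aid, task in tasks.items()}

    # results = asyncio.run(dispatch({"JHU-XXXX": "scoped prompt ..."}))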

Each finding is stated against a named, citable source — a ClinicalTrials.gov NCT ID, an FDA application number, a published paper with DOI, or a press release URL — that a reader can resolve independently.
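
The four identifier types are machine-checkable. A minimal sketch of a citation-shape gate, assuming the simplified patterns below (real FDA application numbers and DOIs come in more formats than these regexes admit):

    import re

    # Illustrative patterns for the four citable source types.
    CITATION_PATTERNS = {
        "nct": re.compile(r"^NCT\d{8}$"),                 # ClinicalTrials.gov ID
        "fda_app": re.compile(r"^(NDA|BLA|ANDA)\d{6}$"),  # Drugs@FDA application number
        "doi": re.compile(r"^10\.\d{4,9}/\S+$"),          # published-paper DOI
        "url": re.compile(r"^https?://\S+$"),             # press-release URL
    }

    def citation_kind(identifier: str) -> str:
        """Classify a citation string; raise if it matches no known shape."""
        for kind, pattern in CITATION_PATTERNS.items():
            if pattern.match(identifier):
                return kind
        raise ValueError(f"uncitable identifier: {identifier!r}")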

04

What was not done

Disclosed scope limits — no MD review, no proprietary knowledge graph, no claim of comprehensiveness.

  • No credentialed MD or PhD reviewed the final output. The PhaseFolio team does not include credentialed biologists or clinicians. The case study is published under the explicit assumption that any reader applying CMO-grade scrutiny will verify findings against the cited sources rather than relying on a credentialed reviewer's signature.
  • No human override of agent findings. Where the audit pass surfaced issues, the fix was a re-run of the agent with corrected scope or a documented note in the asset prose, not a manual rewrite of the agent's analysis.
  • No proprietary biology knowledge graph. The agents do not consult any internal knowledge graph beyond what the model retrieves via web search and registry queries during execution.
  • No claim of comprehensiveness. The deep-dives surface what the agents found. Mechanisms, comparators, or regulatory paths the agents missed are not flagged as missing — they are simply absent. This is a known limitation of any LLM-driven analysis and is the principal reason every finding is anchored to a citable source rather than a credentialed assertion.

05

Verifiability

The credibility argument runs through the sources, not the reviewer.

Every load-bearing claim — a target's subcellular localization, a competitor program's discontinuation, an asset's modality classification, a trial's enrollment count, an FDA approval pathway — is stated with a named source the reader can resolve. ClinicalTrials.gov NCT IDs are stable identifiers; FDA application numbers are queryable in Drugs@FDA; published papers carry DOIs; press releases carry URLs.
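
Resolution can be scripted as well as done in the browser. A minimal sketch against the public ClinicalTrials.gov v2 API (the endpoint is the public one; the NCT ID below is a placeholder, not a citation from the case study):

    import requests

    def overall_status(nct_id: str) -> str:
        """Fetch a trial's overall status from the ClinicalTrials.gov v2 API."""
        resp = requests.get(
            f"https://clinicaltrials.gov/api/v2/studies/{nct_id}",
            timeout=30,
        )
        resp.raise_for_status()
        return resp.json()["protocolSection"]["statusModule"]["overallStatus"]

    # overall_status("NCT01234567")  # placeholder ID; returns e.g. "TERMINATED"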

This is the substrate posture in plain terms: do not trust the analysis; verify it. A reader who finds a claim suspicious can follow the citation in seconds. A reader who finds the analysis convincing has the same audit chain available to defend the conclusion to a partner or LP.

06

Audit pass

Hyper-critical CMO-lens audit — material findings on 8 of 10 top-10 assets.

Before publication, the deep-dives were subjected to a hyper-critical CMO-lens audit pass — itself executed by parallel Claude Opus 4.7 agents, one per asset, with a self-contained prompt instructing each agent to interrogate biology, comparators, regulatory pathway, and rNPV inputs as a CMO advisor at a biotech VC would in the first five minutes of review.

The audit surfaced material findings on 8 of 10 top-10 assets, including: an intracellular target paired with an antibody modality that cannot reach it, a gene-therapy CDS exceeding AAV vector capacity, a malaria asset misclassified as a conventional mAb when the mechanism is vectored immunoprophylaxis, and stale comparator status (VX-522 cited as active when it had been discontinued). Each finding was either fixed in the asset's deep-dive prose or explicitly disclosed as a deferred limitation. The audit findings log is being assembled for separate publication.
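
The audit prompt itself is not published here; a minimal sketch of its shape, with illustrative wording that is not the prompt actually used:

    AUDIT_PROMPT = """\
    You are the CMO advisor at a biotech VC. You have five minutes with one asset.
    Interrogate, in order:
    1. Biology: does the modality plausibly reach the stated target? Flag
       payload-size, subcellular-localization, and mechanism-class mismatches.
    2. Comparators: is each cited competitor active, discontinued, or stalled
       as of {build_date}? Cite an NCT ID or press release for every status claim.
    3. Regulatory: is the claimed pathway (orphan, accelerated approval,
       breakthrough, RMAT) plausible for this indication and population?
    4. rNPV inputs: which assumptions, if wrong, flip the thesis?
    Report only material findings, each anchored to a citable source.

    Asset dossier:
    {dossier}
    """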

07

Roadmap

What we are productizing next.

The deep-dive workflow is slated for productization. The four capability components, in the order we expect to ship them:

  • Comparator verification. ClinicalTrials.gov + press release ingestion + LLM-driven status synthesis. Highest signal-to-noise; partially automatable today via the existing PhaseFolio data pipeline. A status-record sketch follows this list.
  • Mechanism plausibility audit. Structured agent prompts running extended-thinking-mode reasoning against ClinicalTrials.gov, FDA records, primary literature, and patent filings to flag target accessibility, modality–target mismatches, and payload-size constraints.
  • Trial-design feasibility check. Agents matching patient population sizing and endpoint precedent against the asset's indication and modality class, surfacing feasibility gaps.
  • Asset-level investment thesis prose. Templated synthesis from the structured outputs above, anchored to the calibrated rubric weights so prose cannot drift from the underlying methodology.
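
A minimal sketch of the status record such a comparator-verification pass might emit; the field names and enum are hypothetical, not a shipped schema:

    from dataclasses import dataclass
    from datetime import date
    from enum import Enum

    class ComparatorStatus(Enum):
        ACTIVE = "active"
        DISCONTINUED = "discontinued"
        STALLED = "stalled"   # no registry or press activity past a threshold
        UNKNOWN = "unknown"

    @dataclass
    class ComparatorRecord:
        program: str             # e.g. "VX-522"
        status: ComparatorStatus
        as_of: date              # the build date the status was verified against
        source: str              # NCT ID or press-release URL backing the status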

Until each component ships as a self-serve product feature, the deep-dive layer is delivered with PhaseFolio orchestration in the loop. Pilot engagements where a user wants the deep-dive layer applied to their own portfolio are run as design partnerships; the productization roadmap is informed by what those partnerships surface.
