How We Build Our Benchmarks

Methodology

Our benchmarks are calibrated against 2,600+ verified biopharma transactions sourced from regulatory filings, public disclosures, and proprietary intelligence. Here's how we turn raw data into actionable deal intelligence.

Data Foundation

Every benchmark in our platform is grounded in real, publicly disclosed transactions. Our database encompasses 2,600+ biopharma deals spanning 2017 through 2026, covering 12 therapeutic areas, 5 deal structures, and 15+ modalities.

  • 2,600+ Verified Transactions: Licensing, acquisitions, collaborations, options, co-development
  • 850+ Company Profiles: Pharma, biotech, and specialty companies tracked
  • 12 Therapeutic Areas: Oncology through rare disease and women's health
  • 10+ Data Sources: Regulatory filings, press wires, agency databases

Primary Sources

  • SEC Regulatory Filings — 8-K material definitive agreements, the filings companies are legally required to submit when entering material licensing, collaboration, or acquisition agreements
  • Press Release Wires — Deal announcements from global press distribution networks, filtered for biopharma relevance and financial term disclosure
  • Regulatory Agency Databases — FDA, EMA, and other regulatory body approval and authorization records that signal commercial-stage deal activity
  • Clinical Trial Registries — Partnership and sponsor changes in registered clinical trials that indicate underlying licensing or collaboration agreements
  • Proprietary Intelligence Feeds — Web-wide deal monitoring that captures announcements from industry conferences, investor presentations, and non-US company disclosures

Benchmarking Engine

Raw transaction data is transformed into actionable benchmarks through a multi-factor quantitative model that adjusts for the variables that matter most in deal structuring.

Multi-Factor Regression

Our hedonic regression model scores each comparable transaction across multiple dimensions — therapeutic area, modality, clinical phase, territory scope, deal structure, and indication specificity. Transactions are weighted by recency: deals from the last 24 months carry twice the influence of older transactions, ensuring benchmarks reflect current market dynamics rather than historical norms.
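
To make the recency weighting concrete, here is a minimal sketch of a weighted least-squares fit in which deals from the last 24 months count twice as much as older ones. The feature columns, toy values, and function names are hypothetical illustrations, not the platform's actual model or data.

```python
import numpy as np

def recency_weights(deal_age_months, cutoff_months=24, recent_weight=2.0):
    """Return weight 2.0 for deals within the cutoff, 1.0 for older deals."""
    ages = np.asarray(deal_age_months, dtype=float)
    return np.where(ages <= cutoff_months, recent_weight, 1.0)

def fit_weighted_regression(X, y, weights):
    """Weighted least squares: minimise sum_i w_i * (y_i - [1, x_i] . beta)^2."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    X1 = np.column_stack([np.ones(len(X)), X])      # prepend an intercept column
    sw = np.sqrt(np.asarray(weights, dtype=float))  # sqrt-weights turn WLS into OLS
    beta, *_ = np.linalg.lstsq(X1 * sw[:, None], y * sw, rcond=None)
    return beta

# Hypothetical toy data: columns are [is_oncology, clinical_phase];
# y is the log of the upfront payment in $M.
X = [[1, 1], [1, 2], [0, 2], [0, 3], [1, 3], [0, 1]]
y = [2.0, 3.1, 2.4, 3.6, 4.2, 1.5]
age_months = [6, 30, 12, 48, 3, 60]

w = recency_weights(age_months)
beta = fit_weighted_regression(X, y, w)
print("weights:", w)
print("intercept and coefficients:", beta)
```

Scaling each row by the square root of its weight is a standard way to reuse an ordinary least-squares solver for a weighted fit; a production hedonic model would likely use many more factors and categorical encodings.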

Monte Carlo Simulation

Every calculation runs 10,000 Monte Carlo iterations with scenario-weighted distributions — base case (50%), bear case (20%), and bull case (30%). This produces probability-adjusted ranges rather than single-point estimates, reflecting the inherent variability across different market conditions and negotiation outcomes.
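
The scenario weighting can be sketched as a mixture simulation: each iteration first picks a scenario with the stated probabilities (50% base, 20% bear, 30% bull), then draws a value from that scenario's distribution. The multipliers, lognormal spreads, and the `simulate_deal_value` name are assumptions made for illustration only.

```python
import numpy as np

# name: (probability, hypothetical multiplier on the median, hypothetical log-scale sd)
SCENARIOS = {
    "base": (0.50, 1.00, 0.25),
    "bear": (0.20, 0.60, 0.35),
    "bull": (0.30, 1.50, 0.30),
}

def simulate_deal_value(median_value, n_iter=10_000, seed=0):
    """Draw n_iter deal values from a scenario-weighted lognormal mixture."""
    rng = np.random.default_rng(seed)
    names = list(SCENARIOS)
    probs = [SCENARIOS[n][0] for n in names]
    mult = np.array([SCENARIOS[n][1] for n in names])
    sigma = np.array([SCENARIOS[n][2] for n in names])
    # Pick a scenario index for every iteration, then sample within it.
    idx = rng.choice(len(names), size=n_iter, p=probs)
    noise = rng.lognormal(mean=0.0, sigma=sigma[idx])
    return median_value * mult[idx] * noise

draws = simulate_deal_value(100.0)  # hypothetical $100M median comparable
p10, p50, p90 = np.percentile(draws, [10, 50, 90])
print(f"P10={p10:.1f}  P50={p50:.1f}  P90={p90:.1f}")
```

Reporting percentiles of the simulated draws, rather than their mean alone, is what yields a range instead of a single-point estimate.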

Risk-Adjusted NPV

rNPV analysis incorporates phase-specific probability of success rates, indication-specific modifiers (biomarker validation, regulatory precedent, competitive density, modality risk), and scenario bridges that quantify dollar-impact risks like competitor entry, payer restrictions, and label expansion potential.
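
As a simplified illustration of the rNPV arithmetic, each future cash flow is multiplied by the probability that it occurs and then discounted to the present. The probability-of-success figures, the multiplicative modifier, the cash flows, and the function names below are hypothetical placeholders, not the rates used on the platform.

```python
from math import prod

def cumulative_pos(phase_pos, modifiers=()):
    """Multiply the remaining phase-transition probabilities, apply
    multiplicative indication-specific modifiers, and cap the result at 1."""
    return min(prod(phase_pos) * prod(modifiers), 1.0)

def rnpv(cash_flows, discount_rate):
    """cash_flows: iterable of (year, amount, probability the cash flow occurs)."""
    return sum(
        amount * prob / (1 + discount_rate) ** year
        for year, amount, prob in cash_flows
    )

# Hypothetical Phase 2 asset: Phase 2 -> Phase 3 -> filing -> approval.
p_launch = cumulative_pos([0.30, 0.60, 0.90], modifiers=[1.10])  # 1.10 = assumed biomarker uplift

flows = [
    (0, -20.0, 1.0),       # development spend today, certain
    (5, 400.0, p_launch),  # commercial value in year 5, only if approved
]
print(f"cumulative PoS = {p_launch:.4f}")
print(f"rNPV at 10% = {rnpv(flows, 0.10):.2f} ($M)")
```

Scenario bridges such as competitor entry or payer restrictions could be represented in this sketch as extra probability-weighted cash-flow rows, under the same assumptions.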

Semantic Deal Matching

Traditional deal databases match on keywords — "oncology" finds oncology deals. Our platform goes further with semantic matching technology that understands the full context of each transaction.

Every deal in our database is represented as a high-dimensional vector encoding its complete profile — companies, asset characteristics, modality, indication, development phase, territory, deal economics, and strategic context. When you run a calculation, your inputs are similarly encoded and compared against every transaction using cosine similarity.

This means an "oral GLP-1 receptor agonist for obesity at Phase 2" query will surface deals like Zealand/Roche (petrelintide), Carmot/Roche (CT-388), and Structure/Roche (GSBR-1290) — even if the exact keywords don't overlap. The system finds deals that are structurally and strategically similar, not just categorically related.
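
The comparison step can be sketched with plain cosine similarity over embedding vectors. The three-dimensional vectors and the `rank_by_cosine` helper are toy assumptions; real deal embeddings would be high-dimensional and produced by an encoding model that is not shown here.

```python
import numpy as np

def rank_by_cosine(query_vec, deal_vecs, top_k=3):
    """Return (deal_index, similarity) pairs, most similar first.
    Assumes no vector is all zeros."""
    q = np.asarray(query_vec, dtype=float)
    D = np.asarray(deal_vecs, dtype=float)
    sims = (D @ q) / (np.linalg.norm(D, axis=1) * np.linalg.norm(q))
    order = np.argsort(-sims)[:top_k]
    return [(int(i), float(sims[i])) for i in order]

# Toy embeddings standing in for encoded deal profiles.
deal_vecs = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
]
query_vec = [1.0, 0.0, 0.0]

for idx, sim in rank_by_cosine(query_vec, deal_vecs):
    print(f"deal {idx}: cosine similarity {sim:.3f}")
```

Because cosine similarity depends only on the angle between vectors, two deals can rank as close neighbours even when their text descriptions share no keywords.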

Data Quality & Freshness

Stale data produces misleading benchmarks. Our automated ingestion pipeline processes new regulatory filings and deal announcements multiple times per day, ensuring benchmarks reflect the latest market activity.

  • Confidence Threshold: Every transaction is scored for extraction confidence. Deals below 75/100 are excluded from benchmarks; we prioritize accuracy over volume.
  • Source Verification: Deals are cross-referenced against original source documents. Financial terms are only marked as disclosed when explicitly stated in filings or press releases.
  • Continuous Updates: Our pipeline ingests from 10+ data sources on automated schedules (some hourly, some daily, some weekly), depending on source update frequency.
  • Deduplication: Multi-key conflict resolution prevents the same deal from appearing twice, even when announced via different sources or amended in subsequent filings.
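
The confidence threshold and deduplication steps can be sketched together as one filtering pass. The `Deal` fields, the choice of key (normalised licensor, licensee, and asset), and the rule of keeping the highest-confidence record are assumptions for illustration; the actual conflict-resolution keys are not described here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Deal:
    licensor: str
    licensee: str
    asset: str
    confidence: int  # extraction confidence on a 0-100 scale
    source: str

def dedup_key(deal):
    """Hypothetical multi-part key: case- and whitespace-insensitive party and asset names."""
    return (
        deal.licensor.strip().lower(),
        deal.licensee.strip().lower(),
        deal.asset.strip().lower(),
    )

def clean(deals, min_confidence=75):
    """Drop deals scoring below the threshold, then keep one record per key."""
    best = {}
    for deal in deals:
        if deal.confidence < min_confidence:
            continue
        key = dedup_key(deal)
        # On a key collision, keep the record with the higher confidence.
        if key not in best or deal.confidence > best[key].confidence:
            best[key] = deal
    return list(best.values())

# Hypothetical records: the first two describe the same deal from different sources.
raw = [
    Deal("Acme Bio", "BigPharma", "ABC-123", 80, "press wire"),
    Deal("acme bio ", "BigPharma", "abc-123", 92, "8-K filing"),
    Deal("Other Co", "BigPharma", "XYZ-9", 60, "conference"),
]
for deal in clean(raw):
    print(deal)
```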

Understanding Benchmark Ranges

Biopharma deal terms are not deterministic — they are the product of negotiation between parties with different leverage, information, and strategic objectives. Our benchmarks reflect this reality by providing ranges derived from the distribution of comparable transactions, not single-point predictions.
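
One simple way to turn a set of comparables into a range is to report percentiles of their distribution. The upfront values and the 25th/50th/75th percentile choice below are hypothetical; they only illustrate the idea of a range rather than a point estimate.

```python
import numpy as np

# Hypothetical upfront payments ($M) from five comparable deals.
comparable_upfronts = [10, 20, 30, 40, 50]

low, median, high = np.percentile(comparable_upfronts, [25, 50, 75])
print(f"benchmark range: {low:.0f} to {high:.0f} ($M), median {median:.0f}")
```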

The ranges you see represent where similar deals have historically landed across different market conditions, competitive landscapes, and negotiation dynamics. They are designed to inform your deal strategy and provide data-driven anchor points for term sheet discussions — not to predict the exact outcome of any individual negotiation.

For definitive deal structuring, we recommend engaging qualified financial and legal advisors who can incorporate proprietary clinical data, specific IP considerations, and counterparty dynamics that quantitative models cannot fully capture.