
Introduction
In Part I of this series, we introduced the idea that drug candidates emerging from artificial intelligence may have to be valued differently than those emerging from more traditional approaches to drug discovery.
Part II extends this discussion and focuses on the many complex IP issues that emerge when AI enters the drug discovery room. We also pose questions about how these IP issues can influence valuation in ways traditional drug discovery does not.
Part III enters the murky world of AI’s contribution to drug candidate valuation. We pose questions (but do not answer them) about licensing candidates versus providing access to the underlying AI platform.
Part IV extends the Part III discussion and tackles the Platform-versus-Candidate question and its impact on valuation.
We now bring this series to a conclusion by discussing data quality in AI drug discovery.
Data Quality and Validation Standards
A challenge unique to AI-discovered candidates is that their quality (or novelty, if you prefer) depends fundamentally on the training data and computational validation.
This creates valuation uncertainty that simply doesn’t exist with traditionally discovered molecules, where the discovery provenance is less dependent on underlying datasets.
This matters for licensing because data quality determines model reliability. An AI platform trained on high-quality, diverse, well-curated biological data may generate candidates with genuinely superior properties.
But a platform trained on limited, biased, or low-quality data may produce candidates that look promising computationally but fail in wet-lab validation or clinical development.
As legal experts note, every individual use of AI in drug development comes with distinct sets of benefits, risks, and challenges based on the specific data and methods employed.
The GSK-Noetik deal provides an instructive example of how data quality is being explicitly valued. Noetik’s platform is built on what it describes as “the largest proprietary multimodal oncology dataset of its kind, uniquely integrating spatial biology.”
GSK isn’t just licensing the AI models. It’s paying for access to the underlying high-quality training data that makes those models valuable.
The deal explicitly includes collaboration to “generate bespoke human spatial datasets,” recognizing that the data are as important as the computational algorithms.
This creates a tremendous due diligence challenge for licensees.
When evaluating an AI-discovered candidate, business development teams need to assess not just the molecule’s preclinical profile, but also the quality of the discovery process itself.
Key questions may include:
- What data was the AI model trained on?
- How diverse and representative is that training set?
- Has the model been validated on data it wasn’t trained on? (A minimal sketch of this check follows the list.)
- What percentage of the platform’s computational predictions have been experimentally validated, and what was the success rate?
- Were any failed predictions excluded from reported metrics?
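To make the held-out validation question concrete, here is a minimal sketch of the check a licensee’s technical diligence team might ask to see. Everything here is hypothetical: the data are randomly generated stand-ins for molecular features and assay labels, and the model is a generic classifier, not any particular platform’s architecture.

```python
# Hypothetical diligence check: does reported performance hold on data
# the model never saw during training? All data here are random
# stand-ins; a real review would use the licensor's actual datasets.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))                        # stand-in molecular features
y = (X[:, 0] + rng.normal(size=1000) > 0).astype(int)  # stand-in activity labels

# Hold out a test set the model never sees during training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Training-set performance is almost always optimistic; the held-out
# number is the one diligence should insist on seeing.
print("train AUC:   ", roc_auc_score(y_train, model.predict_proba(X_train)[:, 1]))
print("held-out AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
```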
Unfortunately, many of these questions don’t have standard answers yet, and many AI platform companies consider their training data and validation metrics proprietary (which is the correct stance, in our view).
However, this information asymmetry creates risk for Licensees.
A candidate might look excellent in the Licensor’s computational models and even in their internal validation experiments, but if the underlying training data was narrow or the validation standards were lax, Preclinical and Clinical performance could disappoint.
As one analysis notes, firms should take extra caution to ensure appropriate consents for current and future uses of any data subject to privacy obligations, and should provide or seek similarly robust standards around data usage, security, and transparency.
In licensing negotiations, this translates to specific contractual provisions, such as representations about training data quality, access to validation study results, and potentially milestone adjustments if computational predictions don’t translate to experimental success.
There’s also the critical question of computational validation versus wet-lab validation.
Some AI platforms generate thousands of candidate molecules computationally, select the top predictions, synthesize a small subset, and report success rates based only on the synthesized compounds.
That’s fine, but it can create misleading metrics if the selection criteria weren’t rigorous. Thus, Licensees should insist on understanding the full funnel: how many candidates were generated, what filters were applied, how many were synthesized, and what the actual success rate was across all tested compounds.
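To see why the denominator matters, consider a toy funnel with invented numbers. A 40% hit rate among synthesized compounds can coexist with a far lower rate per candidate the platform actually generated:

```python
# Toy funnel arithmetic; every figure below is invented for illustration.
generated      = 10_000  # candidates produced computationally
passed_filters = 500     # survived the platform's in silico filters
synthesized    = 50      # actually made and tested in the wet lab
successes      = 20      # met the stated activity threshold

reported = successes / synthesized     # what a pitch deck might headline
filtered = successes / passed_filters  # per candidate that passed filters
overall  = successes / generated       # per candidate generated at all

print(f"reported (synthesized only): {reported:.1%}")  # 40.0%
print(f"per filtered candidate:      {filtered:.1%}")  # 4.0%
print(f"per generated candidate:     {overall:.1%}")   # 0.2%
# Note: the 450 filtered-but-unsynthesized candidates are unknowns,
# not failures; that is exactly why the full funnel must be disclosed.
```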
Not surprisingly, the regulatory landscape adds further complexity.
The FDA’s January 2025 guidance on AI in drug development takes a risk-based approach based on “context of use,” or how the AI is deployed and what decisions it influences.
Low-risk uses like hypothesis generation require minimal documentation, but if AI predictions directly support regulatory decision-making without traditional validation, much more extensive documentation of data quality and model validation is required.
This means AI-discovered candidates may face different regulatory burdens depending on how much the sponsor relies on computational predictions versus traditional experimental validation.
For BD&L teams, the practical implication is that data quality and validation standards should be explicit factors in AI asset valuation.
Two AI-discovered candidates in the same therapeutic area with similar Preclinical profiles might warrant different deal terms if one comes from a platform with a demonstrated validation track record and high-quality training data, while the other comes from a platform with limited transparency about its data sources and validation methods.
Until industry standards emerge for disclosing training data quality and validation metrics, licensees should approach AI asset valuations with appropriate skepticism.
The computational provenance creates additional risk that should be reflected in deal economics, whether through lower upfront payments, more aggressive milestone gating tied to experimental validation, or contractual provisions that allow renegotiation if AI predictions prove unreliable.
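As a back-of-the-envelope illustration (all figures invented), here is how shifting the same headline value from the upfront payment into a validation-gated milestone changes a licensee’s expected payout when there is real doubt that the computational predictions will hold up:

```python
# Back-of-the-envelope deal economics; every figure is invented.
def expected_payout(upfront, milestone, p_validation):
    """Licensee's expected payout, in $M: the upfront is paid regardless,
    while the milestone is paid only if the AI predictions survive
    wet-lab validation (probability p_validation)."""
    return upfront + p_validation * milestone

p = 0.6  # assumed chance the computational predictions validate

# Two structures with the same $100M headline value.
front_loaded = expected_payout(upfront=60, milestone=40, p_validation=p)  # 84.0
gated        = expected_payout(upfront=20, milestone=80, p_validation=p)  # 68.0

print(f"front-loaded expected payout:     ${front_loaded:.0f}M")
print(f"validation-gated expected payout: ${gated:.0f}M")
# The $16M gap is the exposure the licensee sheds by tying
# economics to experimental validation.
```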
Investors, take note…
The premium for “AI-discovered” should only apply when the underlying data quality and validation standards justify confidence in the discovery process.
How AI companies can justify that confidence remains an open question.
Key Takeaways
- AI candidate quality depends on training data quality and computational validation, which creates valuation uncertainty absent in traditional discovery
- Licensee due diligence should assess not just molecule profiles but also data diversity, validation methodology, and experimental success rates
- Information asymmetry around proprietary training data and validation metrics creates risk that should be addressed through contractual provisions (e.g., lower upfront payments)
- Understanding the full computational funnel (candidates generated, filters applied, synthesis rates, actual success rates) is critical to assessing model performance
- Data quality should be an explicit valuation factor. Candidates from platforms with demonstrated validation track records may warrant different economics than candidates from platforms with limited transparency
That’s It?
Probably not.
It is unlikely that we uncovered all of the issues surrounding valuation of AI-generated drug candidates. Indeed, this is an issue that will only become more prominent in the months and years ahead.
NB: LLMs were used for some of the research aspects of this post.