Eli Lilly Offers Access to AI Models Trained on $1B Worth of Proprietary Drug Discovery Data

In a rare move for the pharmaceutical industry, Eli Lilly and Company has launched TuneLab, a platform that gives external biotech companies access to its proprietary AI models for drug discovery. These models have been trained on decades of preclinical, safety, and molecular data—datasets the company estimates cost over $1 billion to generate.

Here’s what makes it technically interesting:

Built using federated learning, so no one’s raw data is ever exposed, not Lilly’s, not the partners’.
Enables bi-directional model improvement: participating biotechs can contribute their own data (securely), improving the ecosystem while benefiting from it.
Hosted externally by a third party to further reinforce data privacy.
Future versions will include in vivo small molecule prediction models, exclusive to TuneLab.

TuneLab is part of Lilly’s broader Catalyze360 initiative, which also includes venture capital (via Lilly Ventures), shared lab space (Gateway Labs), and R&D collaboration (ExploR&D). With this launch, Lilly adds industrial-grade machine learning infrastructure to its biotech-facing offerings, targeting startups that often lack access to large-scale proprietary datasets and computational models.

Rather than sharing raw data, Lilly is providing access to models trained on its internal data through a secure, federated learning framework. This setup enables third-party biotech partners to use and improve the models without exchanging proprietary data. Model training and updates occur locally, with only encrypted model gradients shared, ensuring that IP and sensitive information remain protected on both sides.
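The local-training-plus-shared-gradients pattern described above can be illustrated with a toy example. This is only a minimal sketch of generic federated gradient averaging and not a description of how TuneLab is implemented: the linear model, the synthetic data, and the names `local_gradient`, `federated_round`, and `make_party` are all hypothetical, and the encryption of gradients in transit is omitted entirely.

```python
import random

def local_gradient(weights, data):
    """Mean-squared-error gradient for a linear model, computed on one party's private data."""
    grad = [0.0] * len(weights)
    for x, y in data:
        pred = sum(w * xi for w, xi in zip(weights, x))
        err = pred - y
        for j, xi in enumerate(x):
            grad[j] += 2.0 * err * xi / len(data)
    return grad

def federated_round(weights, parties, lr=0.1):
    """One round: every party computes a gradient locally; the aggregator sees gradients only."""
    grads = [local_gradient(weights, data) for data in parties]
    total = sum(len(data) for data in parties)
    # Average the gradients, weighting each party by its number of examples.
    avg = [
        sum(g[j] * len(data) for g, data in zip(grads, parties)) / total
        for j in range(len(weights))
    ]
    return [w - lr * a for w, a in zip(weights, avg)]

def make_party(n, true_w, rng):
    """Hypothetical synthetic dataset standing in for one participant's private measurements."""
    data = []
    for _ in range(n):
        x = [rng.uniform(-1, 1) for _ in true_w]
        y = sum(w * xi for w, xi in zip(true_w, x))
        data.append((x, y))
    return data

rng = random.Random(0)
true_w = [2.0, -1.0]  # assumed ground truth, used only to generate the toy data
parties = [make_party(50, true_w, rng), make_party(30, true_w, rng)]

weights = [0.0, 0.0]
for _ in range(200):
    weights = federated_round(weights, parties)

print(weights)
```

In this sketch the raw `(x, y)` pairs never leave `local_gradient`, and only the gradient vectors are combined centrally. A production system would typically also encrypt or securely aggregate those gradients, as the article describes.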

TuneLab's models are trained on a variety of internal datasets, including ADME (Absorption, Distribution, Metabolism, and Excretion) profiling, toxicology and safety pharmacology studies, pharmacokinetics and pharmacodynamics (PK/PD), molecular property predictions for lead optimization, and preclinical screening results across hundreds of thousands of compounds.

These models are intended to support early drug discovery workflows, including compound triage, lead selection, risk assessment, and optimization strategies.

According to Lilly, future iterations of TuneLab will include in vivo small molecule predictive models, available exclusively through the platform.

"Lilly has spent decades building comprehensive datasets for drug discovery," said Dr. Daniel Skovronsky, Chief Scientific Officer. "We're now making the intelligence trained on that data accessible to others in the ecosystem."

Participation in TuneLab is selective. Partner companies are invited to contribute their own experimental data, which is used to further improve the shared models while maintaining strict privacy. This continuous-learning loop allows the platform to evolve while benefiting the entire participant ecosystem.
