Eli Lilly Partners with insitro on Machine Learning Models for Small Molecule Discovery
insitro has entered a new collaboration with Eli Lilly to develop machine learning models capable of predicting pharmacological properties of small molecules, including their in vivo behavior. The models will be trained on Lilly’s proprietary dataset covering absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties gathered over decades of drug discovery programs. The aim is to cut down on costly, time-intensive experimentation and so speed up hit-to-lead and lead optimization work.
insitro, founded in South San Francisco in 2018, has raised over $700 million and is developing a pipeline in metabolic disease and neuroscience. Its broader ChemML platform integrates computational chemistry and AI with experimental data to improve molecular design, while its clinical programs aim to apply AI-driven models in patient stratification and trial optimization.
The collaboration builds on an earlier 2024 agreement focused on siRNA delivery and antibody discovery. In this new effort, insitro will incorporate Lilly’s dataset into its ChemML platform, which integrates machine learning and physics-based in silico screening with proprietary DNA-encoded libraries and active learning medicinal chemistry.
The initiative will run through Lilly TuneLab, a newly launched platform that provides external biotechs access to AI models trained on more than $1 billion worth of Lilly’s preclinical, safety, and molecular data. Built on federated learning, TuneLab allows partners to benefit from and contribute to Lilly’s models without exposing raw datasets. It is part of the company’s Catalyze360 initiative, which also includes venture investment, shared lab space, and R&D collaboration.
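The article does not describe TuneLab's actual training protocol, so the following is only a generic illustration of federated averaging, the textbook form of federated learning. It is a minimal sketch under stated assumptions: the toy linear model, the synthetic data, and every name in it (`local_update`, `federated_average`, `train`) are hypothetical and are not taken from Lilly or insitro. It shows the core idea mentioned above: each partner trains on its own data, and only model weights, never raw records, are sent back and averaged.

```python
from typing import List, Tuple

Weights = List[float]                 # [slope, intercept] of a toy linear model
Dataset = List[Tuple[float, float]]   # private (x, y) pairs held by one partner


def local_update(weights: Weights, data: Dataset,
                 lr: float = 0.05, steps: int = 20) -> Weights:
    """Run a few gradient-descent steps on one partner's private data."""
    w, b = weights
    n = len(data)
    for _ in range(steps):
        # Gradients of the mean squared error for y ~ w * x + b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def federated_average(updates: List[Weights], sizes: List[int]) -> Weights:
    """Combine partner updates, weighting each by its number of samples."""
    total = sum(sizes)
    return [
        sum(u[i] * s for u, s in zip(updates, sizes)) / total
        for i in range(len(updates[0]))
    ]


def train(partners: List[Dataset], rounds: int = 50) -> Weights:
    """Alternate local training and server-side averaging for a fixed number of rounds."""
    global_weights: Weights = [0.0, 0.0]
    for _ in range(rounds):
        # Each partner trains locally; only the resulting weights are shared.
        updates = [local_update(global_weights, data) for data in partners]
        global_weights = federated_average(updates, [len(d) for d in partners])
    return global_weights


if __name__ == "__main__":
    # Synthetic data (an assumption for illustration): both partners
    # sample the same underlying relationship y = 2x + 1.
    partner_a = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
    partner_b = [(-1.0, -1.0), (0.5, 2.0)]
    print(train([partner_a, partner_b]))
```

In this sketch the server only ever sees the two-element weight lists returned by `local_update`. Production systems typically add safeguards such as secure aggregation or differential privacy; whether and how TuneLab does so is not stated in the article.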
insitro’s contribution will add advanced ADMET prediction capabilities to this ecosystem, expanding TuneLab’s use cases in compound triage, lead selection, and risk assessment. By combining Lilly’s datasets with insitro’s machine learning expertise, the effort is designed to accelerate the design of compounds with favorable pharmacokinetic profiles and reduce reliance on in vivo studies.
insitro describes its approach as a “pipeline through platform,” in which multimodal cellular and clinical data are integrated with genetics and machine learning to identify causal disease mechanisms and generate high-confidence drug targets. The modular platform is designed to be reusable across therapeutic areas, producing both in-house and partnered programs while iteratively improving models over time.