Data is king in modern biopharmaceutical research, and the ability to generate massive amounts of quality biomedical data represents tremendous research and business potential in artificial intelligence (AI)-driven drug discovery. Challenges associated with big biological data, such as poor reproducibility, low accessibility, and low standardization, represent a considerable bottleneck for the adoption of AI in drug discovery at scale, and for the ambitions of industry leaders to shift drug discovery from a largely artisanal process to a so-called "industrialized" one.
AI-driven companies demonstrate progress in drug discovery
There is a growing wave of companies building new-generation drug design platforms -- Recursion Pharmaceuticals (NASDAQ: RXRX), Insitro, Exscientia (NASDAQ: EXAI), Insilico Medicine, Deep Genomics, Valo Health, Relay Therapeutics (NASDAQ: RLAY), and others -- that create highly integrated, automated, AI-driven and data-centric drug design processes, from biology modeling and target discovery all the way to lead generation and optimization (sometimes referred to as “end-to-end” platforms). These “digital biotechs” are trying to transform traditional drug discovery, a notoriously bespoke, artisanal process, into a more streamlined, repeatable, data-driven one, more closely resembling an industrial conveyor line for drug candidates.
Announcements by Exscientia, Deep Genomics, Insilico Medicine, and other companies suggest that the time for an entire preclinical program -- from building a disease hypothesis to the official nomination of a preclinical drug candidate -- has shrunk to as little as 11-18 months, at a fraction of the cost of a comparable project conducted “traditionally”. Rapid timelines are also achieved in drug repurposing programs with previously known drugs or drug candidates: for example, by using AI-generated knowledge graphs, as BenevolentAI (AMS: BAI) did in its baricitinib program, or by using advanced multiomics analysis and network biology to derive precision biomarkers for better patient stratification and for matching novel indications, as Lantern Pharma (NASDAQ: LTRN) does to rapidly expand its clinical pipeline.
However, many of those AI-driven “digital biotechs” still rely on community-generated data to train machine learning models, and this may become a limiting factor. While some of the leading players in the new wave, such as Recursion Pharmaceuticals and Insitro, are investing heavily in their own high-throughput lab facilities to get unique biology data at scale, other companies appear to be more focused on algorithms and on building AI systems using data from elsewhere, and have only limited in-house capabilities to run experiments.
Data generation is a bottleneck in AI-driven drug discovery
A common practice is to use community-generated, publicly available data. But this comes with a caveat: an overwhelming majority of published data may be biased or even poorly reproducible. It also lacks standardization -- experimental conditions may differ, leading to substantial variation in data obtained by different research labs or companies. A lot has been written about this, and a decent summary of the topic was published in the Nature Portfolio journal npj Digital Medicine: “The reproducibility crisis in the age of digital medicine”. For instance, one company reported that its in-house target validation effort stumbled because it could not reproduce published data in several research fields. According to the company’s report, in-house results were consistent with published results in only 20-25% of the 67 target validation projects analyzed. There are numerous other reports citing poor reproducibility of experimental biomedical data.
This brings us to a known bottleneck of “industrializing drug discovery”: the need for large amounts of high-quality, highly contextualized, properly annotated biological data that is representative of the underlying biological processes and properties of cells and tissues.
For wide-scale industrialization of drug discovery to occur, the crucial thing is the emergence of widely adopted global industrial standards for data generation and validation -- and the emergence of an ecosystem of organizations that would “produce” vast amounts of novel data following such standards. Then, large drug makers and smaller companies would be able to adopt AI technologies to a much deeper extent. If we take the automotive industry as an example, a component of, say, an engine developed in one part of the world will often fit into a production line in another part of the world. So, highly integrated processes can be built across geographies and companies, in a “plug-and-play” paradigm.
The same approach is required in preclinical research for drug discovery: every lab experiment, every data generation process, and every dataset generated must be “compatible” with all other research processes, machine learning pipelines, etc. -- across the pharmaceutical and biotech communities globally. When this tectonic shift occurs, we will witness a truly exponential change in the performance of the pharmaceutical industry, something I would call the “commoditization” of preclinical research.
These companies enable automated drug discovery
Fortunately, a growing number of companies are starting to bring about the required change in how preclinical research is done. These companies build standardized, highly automated, scalable, and increasingly compatible laboratory facilities, guided by AI-based experiment control systems and supplemented by AI-driven data mining and analytics capabilities. Such “next gen” lab facilities are often available remotely, making preclinical experimentation more accessible to various players across a wider range of geographies.
In this post, let’s review several such companies, which offer various options, including “plug-and-play” experimentation services to drug discovery and biotech organizations.
Strateos, a California-based company founded in 2012 and formerly known as Transcriptic, provides cloud-based automation for routine operations in synthetic biology and medicinal chemistry, as well as a closed design-synthesis-testing cycle for potential drug candidates. Having acquired 3Scan, it expanded its operations to include robotic tissue slicing and analysis using computer vision technologies. In 2020, it started a collaboration with Eli Lilly in which the Strateos Robotic Cloud platform is used at the client’s facilities to increase biology capabilities and implement an automated chemical synthesis loop.
The company raised a total of $101.8 million from a number of investors, including Lux Capital, DCVC, and Black Diamond Ventures.
Recently, Strateos announced that it is pivoting its focus to meet the rising demand for on-site, fully automated cloud labs for life science research and automated drug discovery. The company's LodeStar™ software platform allows companies to manage their on-premises research operations and instruments, enabling remote control, automated drug discovery data generation, and data analytics. Strateos recently completed multimillion-dollar design programs with two top biopharmaceutical companies to accelerate their digital transformation. The strategic shift also involves a reorganization of the company's staffing structure and the establishment of a Project Management Office. This move aims to support the growing demand for laboratory modernization and automated drug discovery solutions.
Another US-based company, Emerald Cloud Lab, founded in 2010, takes a different approach: instead of offering a set of predefined workflows, it provides a broad range of scientific instrumentation and therefore the ability to design fully customizable life science experiments. The company is constantly adding new types of operations and machinery, offering a wide and flexible set of services to its clients. One recent development is a collaboration with Carnegie Mellon University, which will use Emerald Cloud Lab for educational and scientific purposes.
The company raised a total of $92.1 million from a number of investors, including Schooner Capital, Founders Fund, and Alumni Ventures.
The company is also active in democratizing academic research. For instance, Carnegie Mellon University is opening its Cloud Lab in early July, aiming to reshape the field of automated drug discovery. The facility, enabled by Emerald Cloud Lab, will provide students and faculty with remote access to over 200 lab instruments. Researchers can design AI-assisted experiments from any location, while technicians and robots carry out the work on-site.
The Cloud Lab democratizes academic science by allowing researchers from under-resourced institutions to conduct complex experiments with just an internet connection. Centralizing experiments in the cloud lab streamlines the process, reducing costs and the likelihood of errors during experiment replication. Carnegie Mellon is the first academic institution to implement cloud lab technology in collaboration with Emerald Cloud Lab.
Founded in 2016, San Francisco-based Culture Biosciences focuses on scaling up and optimizing bioreactor experiments, which account for substantial overhead for biopharma companies. It has designed a set of bioreactors specifically suited to process fine-tuning and remote real-time monitoring. Along with a wide range of strain screening and process development capabilities, this allows a quick transfer from lab scale to commercial production for both small biotech companies and larger pharmaceutical organizations.
The company raised a total of $101.6 million from a number of investors, including Northpond Ventures, Verily, and Cultivian Sandbox Ventures.
In 2023, Culture Biosciences announced a shift in focus toward upstream bioprocess development for new therapeutics. The company has appointed a new leadership team with expertise in the biotech and biopharma industries to enhance its capabilities. This change aims to address the bottleneck in developing scalable and optimized manufacturing processes for new biologic therapies. The new team members include Elena Cant as Chief Operating and Commercial Officer, Babu Sivaraman as Vice President of Engineering and Product, Sumeet Agrawal as Vice President of Strategy and Finance, and Wayne Evans as Vice President of People. The company expects its expanded offerings to help more companies in the sector accelerate this kind of process development.
In the growing field of automated drug discovery, a user-friendly solution is needed to integrate multiple robotic devices, data collection, and visualization routines from various vendors. UK-based Synthace, founded in 2011, tackles this challenge with its cloud-based "no-code" Synthace Life Sciences R&D Cloud for experiment automation. The software is suitable for both simple liquid-handling tasks and complex multi-step protocols, and it uses a graphical interface that requires no coding skills. The device-agnostic protocol builder enables users to create sophisticated experiment routines and transfer them between multiple devices. The in silico simulation feature helps identify potential errors in a workflow before the actual experiment is run.
Synthace has raised $81 million from investors such as Horizons Ventures, Sofinnova Partners, and SOSV.