Artificial Intelligence in Drug Discovery and Biotech: 2022 Recap and Key Trends

by Andrii Buvailo        Biopharma insight





The advent of AI in drug discovery at a glance

The current advent of artificial intelligence (AI) is shaping the evolution of entire industries, including the pharmaceutical and biotech industries. Unsurprisingly, almost every large and small Life Science organization has shown keen interest in adopting AI-driven discovery platforms in the hope of streamlining R&D efforts, reducing discovery timelines and costs, and improving efficiency.

All of the largest pharmaceutical companies, such as J&J, GSK, AstraZeneca, Novartis, Pfizer, Sanofi, Eli Lilly, and others, have made significant investments in AI technology, including equity investments, acquisitions of, or partnerships with, AI-focused companies, building internal capabilities, or a combination of approaches. 

At the same time, there is a wave of new drug discovery and biotech companies built as AI-centric organizations, often from day one. Founded, for the most part, within the last decade, such companies have already built and tested specialized AI-driven drug discovery platforms -- often including dozens of machine learning models -- and are now starting to reap the rewards in the form of fast and cost-effective target discovery and drug design capabilities, which yield preclinical and clinical drug candidates quickly. Below we discuss a cohort of AI-developed drug candidates -- small molecules, biologics, and other modalities -- which have already entered clinical trials or are about to do so.

Other AI companies can model biology using complex multimodal data at scales not imaginable some twenty years ago. Yet another group of companies developed AI-driven platforms to boost operational efficiency and experiment design of clinical trials or real-world data analysis (e.g., pharmacovigilance). 

Big-tech companies, such as Alphabet, Microsoft, Amazon, IBM, and Tencent, which have competency and expertise in AI and big data technologies, are also making a foray into the drug discovery space -- by investing, founding startups, partnering with life science companies, experimenting, innovating… 

Finally, there is significant progress in other cutting-edge technologies -- quantum computing, Cryo-EM, DNA-encoded libraries, etc. -- which are converging with the artificial intelligence trend to output not only new types of tools, products, and services but also a wave of new startups and even novel business models.


What is AI, and how can it boost drug research?

Artificial intelligence is a relatively old concept, formalized at a famous Dartmouth College conference in 1956. The AI technologies in drug discovery have evolved from earlier machine learning (ML), cheminformatics, and bioinformatics concepts and approaches. For example, the application of machine learning to developing quantitative structure-activity relationship (QSAR) models and expert systems for toxicity prediction has a long history. 

However, the rapid (in some cases “exponential”) growth of big data and advanced analytics, the falling cost of computation, GPU acceleration, cloud computing, algorithm development (e.g., deep neural nets and large language models), and the “democratization” of AI technology have together led to a synergistic “boom” in commercializing and industrializing artificial intelligence, in particular in the pharmaceutical and biotech industries.

In this white paper, we use the collective term “artificial intelligence” to refer to any sophisticated computational and modeling systems that can automatically learn insights and derive practical suggestions from “big data,” whether structured, unstructured, or multimodal.

While we do not restrict the term “artificial intelligence” to a particular family of algorithms, in most cases we mean various flavors of machine learning-based systems (primarily deep neural networks) and large natural language processing (NLP) models. Modern AI systems can learn without being explicitly instructed (in contrast to traditional cheminformatics software built on “if-then” logic), they can improve in accuracy after new learning cycles and as more data is fed to the system, and -- most notably -- they can process high-dimensional multimodal data of enormous size. These attributes are what significantly differentiate modern-day artificial intelligence systems from legacy cheminformatics and bioinformatics software packages, and they are at the center of what drives the ongoing excitement (and hype) about AI.
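The contrast between fixed “if-then” logic and a system that learns from data can be sketched in a few lines. This is a toy illustration only: the descriptor values, labels, and function names below are hypothetical and invented for the example, and the "learning" is a one-parameter threshold search rather than a neural network.

```python
# Toy illustration only: synthetic numbers, not real assay data.

def rule_based_flag(value, cutoff=5.0):
    """Legacy-style 'if-then' logic: the cutoff is fixed by a human expert."""
    return value > cutoff


def learn_cutoff(samples):
    """Learn a cutoff from labeled (value, label) pairs.

    Tries the midpoint between each pair of neighbouring values and keeps
    the first one that misclassifies the fewest training samples.
    """
    values = sorted(v for v, _ in samples)
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(cutoff):
        return sum((v > cutoff) != label for v, label in samples)

    return min(candidates, key=errors)


# Hypothetical labeled measurements: (descriptor value, is_active)
training = [
    (1.0, False), (2.0, False), (2.5, False),
    (3.5, True), (4.0, True), (6.0, True),
]

learned_cutoff = learn_cutoff(training)

new_value = 4.0
print("fixed rule says active:  ", rule_based_flag(new_value))
print("learned rule says active:", new_value > learned_cutoff)
```

The fixed rule never changes unless someone edits the code, whereas the learned cutoff shifts whenever the training data changes, which is the sense in which such systems "improve when more data is fed to the system."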

While some components of what we call “artificial intelligence” -- e.g., machine learning tools and language models -- are used by pretty much every pharmaceutical organization and academic lab, some companies have managed to build sophisticated computational and modeling pipelines, or research “AI platforms,” which include automated workflows across dozens and even hundreds of various models and systems (deep learning, language models), and hundreds of various public and proprietary data sources.

The high sophistication and automation of some AI platforms have led to their “commoditization,” to the point that they carry trademarked commercial names. At the same time, some of them are offered as software-as-a-service to other companies. Examples include mRNA DESIGN STUDIO™ by Moderna, Centaur Chemist® by Exscientia, Guardian Angel™ by AI Therapeutics, ConVERGE™ by Verge Genomics, Taxonomy3® by C4X Discovery, and many others.

Below is an example of Pharma.AI by Insilico Medicine, a modular system for end-to-end drug discovery that comprises hundreds of different sub-systems and machine learning models -- altogether controlled by yet other algorithms of higher modeling abstraction (via a principle of “ensemble learning”).

A scheme of Pharma.AI end-to-end platform. Image credit:
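For readers unfamiliar with the term, the general idea of ensemble learning can be sketched as follows. This is not a description of how Pharma.AI works: the three "base models," their feature names, and their scores are hypothetical stand-ins, and plain score averaging is used because it is one of the simplest ensemble schemes.

```python
from statistics import mean

# Three hypothetical base models. Each maps a compound (a dict of
# made-up descriptors) to a score between 0 and 1.
def size_model(compound):
    return 0.9 if compound["mol_weight"] < 500 else 0.2

def ring_model(compound):
    return 0.8 if compound["ring_count"] <= 4 else 0.3

def donor_model(compound):
    return 0.7 if compound["h_bond_donors"] <= 5 else 0.1

BASE_MODELS = [size_model, ring_model, donor_model]


def ensemble_score(compound, models=BASE_MODELS):
    """Combine the base models by averaging their individual scores."""
    return mean([model(compound) for model in models])


def ensemble_predict(compound, threshold=0.5):
    """Higher-level decision made on top of the combined score."""
    return ensemble_score(compound) >= threshold


# Hypothetical compound: one base model disagrees with the other two.
candidate = {"mol_weight": 350, "ring_count": 6, "h_bond_donors": 2}
print(round(ensemble_score(candidate), 3), ensemble_predict(candidate))
```

The point of the sketch is that no single model decides the outcome: a weak score from one model can be outweighed by the others, and the combining step is itself a (very simple) algorithm sitting above the base models.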


Artificial intelligence is widely used in almost every aspect of pharmaceutical research, from data mining, biology modeling, and target discovery to lead identification and preclinical and clinical research. It is also used for synthesis planning, intelligent search for reagents and research consumables, and auxiliary tasks such as smart laboratory notebooks and virtual assistants. 



The Life Science ecosystem of AI adopters includes the following major categories of players: 

400+ AI-driven companies (startups/scaleups), offering a wide array of AI-driven platforms and services -- from the classical Software-as-a-Service model to custom data science services, drug discovery (“Drug candidate-as-a-service”), and clinical trial support/management resources.

Domain-specific software providers (e.g., KNIME, ChemAxon, Dotmatics, MolSoft, and others) primarily focus on cheminformatics/bioinformatics software but also provide machine learning-powered tools.

Top-tier pharmaceutical and biotech companies developing in-house AI expertise as part of their R&D strategy. Such players often collaborate with external AI vendors and AI-driven biotech startups to explore pilot programs in drug discovery/basic biology/clinical trial analytics.

Top-tier technology companies like Google, Amazon, and Tencent entering the pharmaceutical space, leveraging cutting-edge AI technologies and big data infrastructures.

Contract research organizations (CROs) developing expertise in AI to augment their value offering to pharma/biotech customers.

Academic labs in pharma/biotech space, conducting AI research and developing specialized frameworks and tools relevant to the industry (usually a cradle for future AI startups/spin-outs).

Non-domain-specific software providers developing AI-as-a-service packages and models suitable for application in pharmaceutical research (e.g., “out of the box AI”).

Open-source machine learning tools and frameworks, widely exploited by life science professionals in their research projects.


AI drug discovery investment landscape, 2022

After 2021, an anomalously successful year for the biotech industry in terms of the number of venture capital deals, a record number of initial public offerings, an abundance of successful exits, and a generally very positive climate in the stock market, 2022 brought a significant cooling of financial activity and outright poor stock market performance.

However, the artificial intelligence in drug discovery sector demonstrated a certain resilience, at least in the private equity transactions landscape, with several companies raising hundreds of millions in venture capital. Some examples include Beijing-based MegaRobo Technologies ($300 million Series C), Massachusetts-based ConcertAI ($150 million Series C) and Celsius Therapeutics ($83 million Series A), Hong Kong-based Insilico Medicine ($95 million Series D), California-based BigHat Biosciences ($75 million Series B) and DeepCell ($73 million Series B), and several others -- read “Major VC Rounds For AI Companies in Drug Discovery and Biotech in 2022”.

The merger and acquisition (M&A) landscape was marked by a notable recent deal in which biotech giant Ginkgo Bioworks acquired Zymergen in a transaction valuing Zymergen at $300 million. The acquisition brings Zymergen’s machine learning and data science capabilities together with Ginkgo’s synthetic biology platform.


Key industry observations and trends

The advent of AI and data technologies, together with novel computational tools and infrastructural solutions (databases, cloud services, etc.), is redefining the way the pharmaceutical industry operates -- on research, clinical, and business levels. Below we review some of the trends and observations in the AI for drug discovery space, along with illustrative industry developments in 2022.


AI-enabled biology modeling and target discovery

In drug discovery research, identifying novel drug targets is critical for developing first-in-class therapeutic drugs -- potential “blockbusters.” Over the past several decades, drug discovery efforts have traditionally centered on targeting specific proteins with suitable “pockets” that can be influenced by a ligand molecule (often a small molecule). But out of the entire set of human proteins (the “proteome”), only a small number have been explored as targets. There are currently 20,360 human proteins in Swiss-Prot, of which approximately 4,600 are known to be involved in disease mechanisms according to the OMIM database, representing around 22% of human proteins. These disease-linked proteins are the obvious region of the human proteome likely to contain viable drug targets. However, as of 2017, only around 890 human and pathogen-derived biomolecules (mostly proteins) were actually targeted by existing FDA-approved drugs. These biomolecules included 667 human-genome-derived proteins targeted by drugs for human disease. Things are not much different today, so there is still a lot of room for identifying novel targets in this pool. Novel computational approaches based on artificial intelligence technologies allow for identifying new druggable protein pockets at scale, sometimes enabling proteome-wide virtual screens.
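The proportions behind these figures can be checked with a short calculation. The snippet uses only the numbers quoted above; the variable and function names are illustrative. One assumption is worth flagging: the last ratio compares the 667 drugged human proteins with the roughly 4,600 disease-linked proteins, although the text does not state that the former are a strict subset of the latter, so that figure is indicative only.

```python
# Figures as quoted in the text; names are illustrative.
swiss_prot_human_proteins = 20_360
disease_linked_proteins = 4_600        # approximate, per OMIM
drugged_human_proteins = 667           # targets of FDA-approved drugs, as of 2017


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


disease_share = percent(disease_linked_proteins, swiss_prot_human_proteins)
drugged_share = percent(drugged_human_proteins, swiss_prot_human_proteins)
# Assumption: treats the drugged proteins as if they sat inside the
# disease-linked set, which the source does not state explicitly.
drugged_vs_disease = percent(drugged_human_proteins, disease_linked_proteins)

print(f"Disease-linked share of human proteins: {disease_share:.1f}%")
print(f"Drugged share of human proteins:        {drugged_share:.1f}%")
print(f"Drugged relative to disease-linked:     {drugged_vs_disease:.1f}%")
```

The first ratio comes out at roughly 22 to 23%, in line with the "around 22%" stated above, while the drugged proteins amount to only a few percent of the human proteins listed, which is the gap the paragraph describes.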

Even more exciting, advanced modeling tools help identify and modulate novel types of targets, such as protein-protein interactions, targets with large contact areas, and protein-nucleic acid interactions, as well as next-generation approaches such as exploiting the cell’s protein degradation machinery.


Many AI-driven companies focus on modeling biology, discovering and validating novel targets, and offering “disease model-as-a-service” or “target-discovery-as-a-service” to other organizations. Demand for this kind of contract research service is rising, as reflected in the growing number of target discovery partnerships.

For example, in September 2022, Israel-based biology modeling company CytoReason announced an expanded $110 million collaboration with Pfizer. The two companies began working together in 2019, when Pfizer started using CytoReason’s biological models in research aimed at developing new drugs for immune-mediated diseases and cancer immunotherapies.



  • Mike Herman 2023-01-26 13:04

    This is fascinating. I speak on patient advocacy and it's this type of development that gives cancer patients hope. I've had Multiple Myeloma for just about a decade and my life expectancy back then was 4 years.

