In drug discovery, there can be a world of difference between a computer-generated hit compound, which is predicted to bind well to a drug target, and a compound that can be reliably synthesized at scale, or indeed synthesized at all. This discrepancy has been a lingering point of discord between the Discovery and R&D efforts in the chemical industry. Computer-aided drug design (CADD) has become an increasingly valuable tool, providing essential screening data and unique insight into drug action and mechanism, but it does not model the more complex world of chemical reactivity and synthetic chemistry.
Generative AI models in chemistry are increasingly popular in the research community, mainly because of their relevance to drug discovery applications. They generate virtual molecules with desired chemical and biological properties (more details in this blog post).
However, this flourishing literature still lacks a unified benchmark. Such a benchmark would provide a common framework for evaluating and comparing different generative models. Moreover, it would help to formulate best practices for this emerging industry of ‘AI molecule generators’: how much training data is needed, how long a model should be trained, and so on.
Increasing the success rate of high-throughput screening (HTS) and the quality of the resulting hits, as well as the developability of drug candidates, is among the key challenges of small molecule drug discovery programs.
The above goals are associated with the so-called “compound quality” or “drug-likeness” of the starting small molecules. Most considerations in this context relate to lead-like properties, physicochemical properties, diversity, effective coverage of the chemical space, privileged structures for drugs, or structures possessing favourable physical properties or metabolic stability. However, a recent analysis by F.W. Goldberg et al. (2014) showed that an effective and somewhat overlooked strategy for increasing compound quality is to focus more closely on the choice of the building blocks used in the course of drug discovery programs. The nature of the selected building blocks determines not only the speed of the research but also the quality of the resulting drug candidates and their potential in further trials.
Obviously, some reagent classes are more popular than others because of the synthetic routes historically adopted by organic and medicinal chemists.