A Clear Example of AI Value For Drug Discovery Has Arrived

by Andrii Buvailo, PhD          Biopharma insight / Featured Research

Disclaimer: All opinions expressed by Contributors are their own and do not represent those of their employers or of BiopharmaTrend.com.
Contributors are fully responsible for ensuring they own any required copyright for any content they submit to BiopharmaTrend.com. This website and its owners shall not be liable for the information and content submitted for publication by Contributors, nor for its accuracy.

Topics: Emerging Technologies   

With all the hot discussions (for instance, here, here, here and here) going on right now among medicinal chemists, pharmaceutical researchers, and data scientists as to what artificial intelligence (AI) means for the future of drug discovery, the life science world has divided into “AI-believers”, “AI-atheists”, and “AI-agnostics”.

It is useless to repeat what has already been said many times about the successes of AI in areas like natural language processing, image processing, pattern recognition, and self-driving cars (here is the summary), but few of us knew whether those sorts of results (or any meaningful results at all) could possibly be achieved with systems as complex as biological organisms. Finally, however, a hint of hope has arrived.

I just came across a fresh publication in Molecular Informatics which pushes the limits of our understanding of what AI can already do today for medicinal chemistry and drug discovery. And it sounds cool indeed.

In their paper, titled De Novo Design of Bioactive Small Molecules by Artificial Intelligence, the authors explain how they trained an AI model, obtained new bioactive molecules from it, and then tested the molecules against their targets in hybrid reporter gene assays, confirming one selective lead with nanomolar activity and a back-up compound, both new to the chemical record.

A brief summary of how they did it was provided in this LinkedIn post by Alfred Ajami; I am copying the text here:

1) Train an RNN/LSTM on the SMILES of 500k bioactives (< 1 µM) from ChEMBL22.

2) Fine-tune by transfer learning to enable de novo generation of structures: use 25 fatty acid mimetics with known agonistic activity on retinoid X receptors (RXR) and/or peroxisome proliferator-activated receptors (PPAR).

3) From the fine-tuned AI model, pick 1000 SMILES by fragment growing from the minimalist start fragment “−COOH”, rejecting structures identical to any in the training set.

4) a) Rank the de novo designs for predicted effects on RXRs and PPARs using target prediction methods (SPiDER) and shape and charge descriptors to determine the similarity of the designed compounds to known bioactive ligands; b) merge the individual screening lists to obtain a final set of high-scoring hits (49).

5) Select the best 5 for synthesis, "taking into account their individual in silico ranks and building block availability"; reject any found in chemical or patent databases, to ensure novelty.

6) Test against targets in hybrid reporter gene assays (with comparative controls).

(See the original article for the Results and Discussion)
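For readers who think in code, the filtering and list-merging logic of steps 3 and 4b can be sketched in a few lines. This is a minimal illustration, not the authors' actual code: the SMILES strings and the rank-fusion rule below are hypothetical, and we assume all SMILES have already been canonicalized so string comparison suffices.

```python
# Sketch of steps 3-4b: reject training-set duplicates, then merge
# per-target ranking lists into one set of high-scoring hits.
# All molecules and the fusion rule are illustrative assumptions.

def novel_only(generated, training_set):
    """Keep only generated SMILES absent from the training set
    (assumes both lists hold canonical SMILES strings)."""
    seen = set(training_set)
    return [smi for smi in generated if smi not in seen]

def merge_rankings(rankings, top_n):
    """Fuse several per-target rank lists by the best (lowest)
    rank each molecule achieves, then keep the top_n overall."""
    best = {}
    for ranked in rankings:
        for pos, smi in enumerate(ranked):
            best[smi] = min(best.get(smi, pos), pos)
    return sorted(best, key=best.get)[:top_n]

candidates = novel_only(["CCO", "CC(=O)O", "OC(=O)CCC"], ["CCO"])
hits = merge_rankings([candidates, list(reversed(candidates))], top_n=2)
```

In the paper the per-target scores come from SPiDER and shape/charge descriptors; the `min`-rank fusion here is just one simple way to combine such lists.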

I think the results are inspiring and understandable for a non-AI medicinal chemist. Still, knowing a bit more about the basics of machine learning and neural network architectures is important for life science professionals, so here is a very informative educational piece covering all these things at a high-concept level.





  • Mostapha Benhenda 2018-01-11 22:39

    It’s an interesting experiment indeed: this ETH Zurich team applied the AstraZeneca method. They make an allusion to the diversity problem here:

    "Importantly, the newly generated molecules populate the chemical space of the training data, residing within the RXR/PPAR region of the fine-tuning set (Figure 2)."

    In support of this claim, I only found the sketchy Figure 2.

    I didn’t find their quantitative evaluation of diversity, and in particular, how they mathematically define the keyword 'populate'. Tell me if you see something.

    That’s where I said that important work remains to be done. In this paper, they seem to only show a 2-dimensional visualization of a 100-something-dimensional space.

    Visualizations are cool for a popular science blog, but to get a real grasp of what is going on, it’s necessary to introduce equations, and quantitative metrics. Where are they?

    That’s especially important when you want to compare different architectures: for example, to benchmark this AstraZeneca RL method against the Harvard ORGAN.

    Therefore, from the viewpoint of the quantitative evaluation of diversity, this ETH Zurich paper seems much worse than both the AstraZeneca paper and the Harvard paper (but the experimental part looks cool).
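As an editorial aside, the kind of quantitative diversity metric the comment above asks for can be as simple as the mean pairwise Tanimoto distance over molecular fingerprints. The sketch below uses toy bit sets rather than real fingerprints, and the metric is one common choice, not necessarily what any of the cited papers used.

```python
# One simple quantitative diversity metric: mean pairwise Tanimoto
# distance. Fingerprints are toy sets of "on" bits, not real
# molecular fingerprints.
from itertools import combinations

def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0

def mean_pairwise_distance(fps):
    """Mean of 1 - Tanimoto over all pairs: 0 = identical set, 1 = fully disjoint."""
    pairs = list(combinations(fps, 2))
    return sum(1.0 - tanimoto(a, b) for a, b in pairs) / len(pairs)

fps = [{1, 2, 3}, {1, 2, 4}, {5, 6, 7}]
diversity = mean_pairwise_distance(fps)  # closer to 1 means more diverse
```

A single scalar like this makes it possible to compare the diversity of generated libraries across architectures, which is exactly the benchmarking gap the comment points to.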

  • Alfred Ajami 2018-01-12 16:02

    Thank you, Andrii, for calling this one out. BPT provides an opportune forum for further discussion, so allow me to jump in again with perhaps contrarian commentary.

    My skepticism over AI applications to drug discovery (DD) is in part born of a historical bias, dating to the infancy of computational chemistry itself. This field began to blossom 50 years ago not because Marvin Minsky (a personal friend in the Examiner Club) and the Media Lab sent out their students and postdocs to learn organic synthesis but rather because EJ Corey (mentor and task master) sent out his people to learn about parsers, list processors and perceptrons to be morphed into synthons. Marvin was focused on process and EJ was focused on product.

    This then is the metaphor that must govern. AI applications in DD should focus more on quality outcomes, namely molecules that work in the hands of chemists and their biologist friends, and not get hung up in process that devolves into proving "my neural net is bigger than yours" regardless of practical outcome. I'll take (seemingly) third-rate AI practiced by first-rate chemists (Phil Baran, where are you?) over first-rate AI practiced by third-rate chemists.

    Same deal by analogy to another AI antecedent that few may remember, fuzzy logic, a seminal breakthrough that toppled 2000 years of symbolic logic and crisp set theory. It went nowhere from the lofty intellectual perch of informaticians (despite Lotfi Zadeh's rigorous work) until practical engineers, unquestionably not funded by VCs, took lesser, and much criticized, implementations of the code and put them into microwaves, washing machines, and everything we own that has been unloaded by robotic cranes from container ships.

    A last point, if AI experts expect to impact DD, they need more visits to the library, studying up on rudimentary cheminformatics. As a consultant to senior management asked to evaluate teams, I am growing increasingly intolerant of AI aspirants to DD who have little grounding in cheminformatics, as for example in fragment propagation, bioisosteric replacement, or in the tools for mapping chemical space, determining 3D molecular complexity, or depicting it in terms of 2D coordinates (ever heard of Normalized PMI Ratios aka NPRs?).

    Bottom line: we all have a lot of work to do, albeit some more than others. I'll do my bit to find more useful case studies. The paper from the Schneider group is a step in the right direction, especially since it makes the case for low-budget, compressed-timeline drug discovery in the hands of actual chemists. And I will even forgive them for being academics!

    • Andrii Buvailo 2018-01-13 12:35

      Very interesting historical remark, actually.

      On the other points, I fully agree with you that the nature of the AI method is of secondary importance, as long as it can provide value to the medicinal chemist or other drug discovery practitioner. However, I would like to point out that viewing currently available AI innovations as an alternative to cheminformatics tools is not precisely the point of AI's value. The idea of incorporating AI-driven analytics into DD is to change the whole research process, the workflow model, rather than incrementally improving some particular task for a medicinal chemist, such as obtaining "better" results in a docking experiment compared to using "regular" docking software.

      It is like the invention of the personal computer. The fact that the first computer was more powerful than a calculator was cool, but that was not the real breakthrough. The computer provided much more than that: its use allowed for the transformation of the whole workflow, the invention of the Internet, the invention of the graphical user interface, and so on. People now use computers to access and process data in whole new ways, faster and more efficiently than in the "pre-computer" era.

      The idea of "AI transforming the industry", as I understand it, is to incorporate AI analytics at all stages of drug discovery and connect the dots between all the stages into a self-improving loop, which would keep feeding new data from later stages back into the earlier stages to constantly improve the results. I found a nice diagram which might better explain what I am trying to say: https://www.genengnews.com/gen-news-highlights/neural-networks-identify-drug-mimics-as-potential-antiaging-nutraceuticals/81255217
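An editorial aside: the "self-improving loop" idea above can be sketched schematically in code. Everything below is a hypothetical stub (a toy "generator" and a toy "assay"); it shows only the shape of the feedback loop, not any real drug discovery pipeline.

```python
# Schematic sketch of a design-test-learn loop: results from the
# later (assay) stage are fed back to update the earlier (design)
# stage. All functions are illustrative stubs.

def propose(params, n=5):
    """Stub design stage: 'designs' are just integers derived from params."""
    return [params + i for i in range(n)]

def assay(designs):
    """Stub assay stage: pretend multiples of 3 are active."""
    return [d for d in designs if d % 3 == 0]

def discovery_loop(cycles):
    params, history = 0, []
    for _ in range(cycles):
        actives = assay(propose(params))
        history.append(len(actives))
        params += len(actives)  # feed later-stage data back into the model
    return history
```

The point is only structural: each cycle's output changes the next cycle's input, which is what distinguishes a closed loop from a one-shot tool.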

      • Alfred Ajami 2018-01-14 01:39

        Thanks, Andrii. I appreciate your points. My recounting of the Schneider paper and comments about the role of AI were focused on drug discovery (DD) and specifically on discovery chemistry. I must admit, in connection with another recent AI application in DD covered in LI, https://www.linkedin.com/feed/update/urn:li:activity:6358116177360801792, that even an AI-atheist might blink.

        As to your pointing out the GEN piece, I agree that the diagram does a great job of encapsulating the bigger story.

        But making the case on nutraceuticals as cures in aging would be a pyrrhic victory for AI. I can't even imagine the size or compute the cost of blinded, placebo controlled, randomized, stratified (or multi-disease factorial) and time sequential (because aging has a long progression timeline) pivotal trials (at least 2). The NPV isn't going to be pennies-per-pill, although that surely will be the expected price for ginsenoside or epigallocatechin. Best to leave food as food and not pretend that food will be medicine without meeting the regulatory burden of therapeutic claims, irrespective of AI.

        • Andrii Buvailo 2018-01-14 09:57

          Alfred, you are right, it is probably not the best example of a use case. The diagram is nice for illustrating that, at the end of the day, AI will change the process rather than just become a tool, though. Actually, I recently had another interview with a representative of Cloud Pharmaceuticals. He basically echoed your words from the commentary above: "AI is not the key, the key is good molecules". I am preparing the interview for publication at BPT in the coming days. It is interesting to see how different people in the industry view "AI for DD" from different points of view!

