Chemical Data Has Problems
The state of data access, quality and dissemination in chemistry is extremely poor, so poor that it is blocking advances in machine learning (ML) and artificial intelligence (AI) and also impeding research and development with traditional methods. The recent surge in AI skepticism is a direct consequence of years of over-hype and promises built on precarious data. Over-the-top expectations were set without enough consideration for the data quality and volume required to train fancy algorithms. The old adage “^&$% in, ^&$% out” holds true (we can say ‘crap’, right?). This opinion is in line with recent statements by, for example, the CEO of Novartis, the second largest pharmaceutical company in the world, who lamented how difficult it is to access the quality datasets needed to make AI effective.