The aging of the world population, the growing burden of chronic and infectious diseases and the emergence of new pathogens have made the need for new treatments more urgent than ever. Yet discovering a new drug and bringing it to market is a long, arduous and costly journey marked by many failures and few successes.
Artificial intelligence has long been seen as the solution to overcoming some of these obstacles due to its ability to analyze vast amounts of data, uncover patterns and relationships, and predict effects.
But despite its enormous potential, AI has yet to deliver on its promise to transform drug discovery.
Today, a multi-institutional team led by Harvard Medical School biomedical informatician Marinka Zitnik launched a platform that aims to optimize AI-based drug discovery by developing more realistic and more faithful algorithms.
Therapeutics Data Commons, described in a recent commentary in Nature Chemical Biology, is an open-access platform that serves as a bridge between computer scientists and machine learning researchers on the one hand and biomedical researchers, biochemists, clinical researchers and drug designers on the other – communities that have traditionally worked in isolation from each other.
The platform offers both dataset curation and algorithm design and performance evaluation for multiple treatment modalities, including small molecule drugs, antibodies, and cell and gene therapies, at all stages of drug development, from the identification of chemical compounds to the performance of drugs in clinical trials.
Zitnik, an assistant professor of biomedical informatics at HMS’s Blavatnik Institute, conceptualized the platform and is now leading the work in collaboration with researchers from MIT, Stanford University, Carnegie Mellon University, Georgia Tech, University of Illinois Urbana-Champaign, and Cornell University.
She recently discussed Therapeutics Data Commons with Harvard Medicine News.
HMNews: What are the main challenges in drug discovery and how can AI help solve them?
Zitnik: Developing a drug from scratch that is both safe and effective is an incredible challenge. On average, it takes between 11 and 16 years and between 1 and 2 billion dollars to do so. Why is that?
It is very difficult to determine early on whether an initially promising chemical compound would produce results in human patients consistent with the results it shows in the laboratory. The number of small molecule compounds is 10 to the 60th power, but only a tiny fraction of this astronomically large chemical space has been prospected for molecules with medicinal properties. Despite this, the impact of existing therapies on treating disease has been staggering. We believe that new algorithms coupled with automation and new datasets can find many more molecules that can translate to improved human health.
AI algorithms can help us determine which of these molecules are most likely to be safe and effective human therapies. This is the ultimate problem plaguing drug discovery and development. Our vision is that machine learning models can help sift through and integrate large amounts of biochemical data that we can connect more directly to molecular and genetic information, and ultimately to individualized patient outcomes.
HMNews: How close is AI to making that promise a reality?
Zitnik: We’re not there yet. There are a number of challenges, but I would say the most important is to understand how well our current algorithms perform and whether their performance translates to real-world problems.
When we evaluate new AI models through computer modeling, we test them against benchmark datasets. Increasingly, we see in publications that these models achieve near-perfect accuracy. If so, why aren’t we seeing the widespread adoption of machine learning in drug discovery?
Indeed, there is a big gap between performing well on a benchmark data set and being ready to transition to real-world implementation in a biomedical or clinical setting. The data on which these models are trained and tested is not indicative of the type of challenges these models are exposed to when used in real practice, so it is very important to close this gap.
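One way to picture the gap Zitnik describes is to compare two ways of splitting a benchmark. The sketch below is only an illustration, not the platform's method: the compound "families", the record layout and both function names are assumptions made for this example. A random split lets close relatives of the test compounds appear in training, while a group-aware split holds out entire families, which is closer to meeting genuinely new chemistry in practice.

```python
import random
from collections import defaultdict


def random_split(records, test_fraction=0.25, seed=0):
    """Shuffle individual records; related compounds can land on both sides."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def group_split(records, test_fraction=0.25, seed=0):
    """Hold out whole families so no test family was seen during training."""
    by_family = defaultdict(list)
    for rec in records:
        by_family[rec["family"]].append(rec)

    families = sorted(by_family)
    random.Random(seed).shuffle(families)

    target_test_size = len(records) * test_fraction
    train, test = [], []
    for fam in families:
        # Fill the test set family by family until it reaches the target size.
        bucket = test if len(test) < target_test_size else train
        bucket.extend(by_family[fam])
    return train, test


def shared_families(train, test):
    """Families that appear on both sides of a split."""
    return {r["family"] for r in train} & {r["family"] for r in test}


# Hypothetical toy benchmark: 4 compound families with 5 members each.
records = [{"id": f"{fam}-{i}", "family": fam} for fam in "ABCD" for i in range(5)]

rand_train, rand_test = random_split(records)
grp_train, grp_test = group_split(records)

print("random split, families on both sides:", sorted(shared_families(rand_train, rand_test)))
print("group split, families on both sides:", sorted(shared_families(grp_train, grp_test)))
```

A model scored on the group-aware split is being asked to generalize to families it has never seen, so its score is usually a more sober estimate than the one obtained from the random split.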
HMNews: Where does the Therapeutics Data Commons platform come in?
Zitnik: The purpose of Therapeutics Data Commons is precisely to address these challenges. It serves as a meeting point between the machine learning community on one side and the biomedical community on the other. This can help the machine learning community with algorithmic innovation and make these models more translatable to real-world scenarios.
HMNews: Could you explain how it actually works?
Zitnik: First of all, keep in mind that the drug discovery process runs the gamut, from initial drug design based on data from chemistry and chemical biology, to preclinical research based on data from animal studies, and up to clinical research in human patients. The machine learning models we train and evaluate as part of the platform use different types of data to support the development process at all these different stages.
For example, machine learning models that support small molecule drug design typically rely on large datasets of molecular graphs, i.e. the structures of chemical compounds and their molecular properties. These models find patterns in the known chemical space that relate parts of the chemical structure to the chemical properties necessary for drug safety and efficacy.
Once an AI model is trained to identify these telltale patterns in the known subset of chemicals, it can be deployed to search for the same patterns in large, as yet untested chemical datasets and make predictions about how these chemicals would work.
To design models that can aid in late-stage drug discovery, we train them on data from animal studies. These models are trained to search for patterns that relate biological data to probable clinical outcomes in humans.
We can also ask if a model can look for molecular signatures in chemical compounds that correlate with patient information to identify which subset of patients are most likely to respond to a chemical compound.
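The small-molecule workflow described in this answer (learn structure-to-property patterns from known compounds, then score untested ones) can be sketched with a deliberately simple baseline. This is a toy stand-in, not the models used on the platform: real systems work on molecular graphs, whereas here each molecule is reduced to a set of hypothetical fragment labels, and the property values and function names are invented for illustration.

```python
from collections import defaultdict


def fit_fragment_scores(training_set):
    """Average the measured property over all molecules containing each fragment."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for fragments, value in training_set:
        for frag in fragments:
            totals[frag] += value
            counts[frag] += 1
    return {frag: totals[frag] / counts[frag] for frag in totals}


def predict(fragment_scores, fragments, default=0.0):
    """Score an untested molecule as the mean score of its known fragments."""
    known = [fragment_scores[f] for f in fragments if f in fragment_scores]
    return sum(known) / len(known) if known else default


# Hypothetical "known chemical space": (fragment set, measured property) pairs.
training_set = [
    ({"ring", "amine"}, 1.0),
    ({"ring", "halogen"}, 0.0),
    ({"amine", "ester"}, 1.0),
    ({"halogen", "ester"}, 0.0),
]

scores = fit_fragment_scores(training_set)

# An untested molecule built from fragments the model has already seen.
untested = {"amine", "ring"}
prediction = predict(scores, untested)
print("learned fragment scores:", scores)
print("predicted property for untested molecule:", prediction)
```

In this toy data the "amine" fragment always co-occurs with a high property value, so the baseline learns to favor it. A molecule made only of unseen fragments falls back to the default, which is the same limitation, in miniature, that makes evaluation on genuinely new chemistry so important.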
HMNews: Who are the contributors and end users of this platform?
Zitnik: We have a team of volunteer students, scientists and experts who come from partner universities and industry, including small start-ups in the Boston area as well as large pharmaceutical companies in the United States and Europe. Computer scientists and biomedical researchers contribute their expertise in the form of state-of-the-art machine learning models and preprocessed and curated datasets that are standardized so they can be published and are ready for use by others.
Thus, the platform contains both analysis-ready datasets and machine learning algorithms, as well as robust metrics that tell us how a machine learning model is performing on a specific dataset.
Our end users are researchers from all over the world. We host webinars to showcase new features, get feedback, and answer questions. We offer tutorials. This ongoing training and feedback is really crucial.
We have 4,000-5,000 active users every month, most of them in the US, Europe and Asia. Overall, we have seen over 65,000 downloads of our machine learning algorithm and dataset package. We have seen over 160,000 downloads of harmonized and standardized datasets. The numbers are increasing and we hope they will continue to increase.
HMNews: What are the long-term goals of Therapeutics Data Commons?
Zitnik: Our mission is to support AI drug discovery on two fronts. First, in the design and testing of machine learning methods at all stages of drug discovery and development, from chemical compound identification and drug design to clinical research.
Second, to support the design and validation of machine learning algorithms in several therapeutic modalities, especially newer ones, including biologics, vaccines, antibodies, mRNA-based drugs, protein therapies, and gene therapies.
Machine learning offers a tremendous opportunity to contribute to these new therapies, and we have yet to see the use of AI in these areas to the extent that we have seen it in small molecule research, on which much of the attention is today. This gap is primarily due to a dearth of standardized AI-ready datasets for these new therapeutic modalities, which we hope to fill with Therapeutics Data Commons.
HMNews: What sparked your interest in this work?
Zitnik: I’ve always been interested in understanding and modeling the interactions between complex systems, that is, multi-component systems that interact with each other in non-independent ways. It turns out that many problems in therapeutic science are, by definition, precisely such complex systems.
We have a protein target which is a complex three-dimensional structure, we have a small molecule compound which is a complex graph of atoms and bonds between those atoms, and then we have a patient, whose description and health status are given in the form of a multi-scale representation. This is a classic complex system problem, and I really enjoy researching and finding ways to normalize and “tame” these complex interactions.
Therapeutic science is full of these kinds of problems that are ripe to benefit from machine learning. That’s what we’re looking for.
Kexin Huang et al, Artificial intelligence foundation for therapeutic science, Nature Chemical Biology (2022). DOI: 10.1038/s41589-022-01131-2
Provided by Harvard Medical School
Citation: Can AI transform the way we discover new medicines? (2022, November 16) retrieved November 16, 2022 from https://phys.org/news/2022-11-ai-drugs.html