Who would have thought that thanks to deep learning we could predict the properties of drugs that do not yet exist? This is of great importance to the pharmaceutical industry.
With regard to artificial intelligence, complaints can sound something like this: “Almost eight years have passed since the invention of the AlexNet neural network [translator's note: in 2012 Alex Krizhevsky published the design of the AlexNet convolutional neural network, which won the ImageNet competition by a wide margin], so where is my self-driving car?” Indeed, it may seem that the expectations of the mid-2010s have not been met. Among pessimists, predictions about
The purpose of this essay is to discuss the significant progress machine learning has made on the real-world challenge of drug discovery. I want to remind you of another old adage, this time from AI researchers. To paraphrase slightly, it goes like this: “AI is called AI until it works, then it’s just software.”
What until a few years ago was considered cutting-edge fundamental research in machine learning is now often referred to as “just data science” (or even analytics) – and is revolutionizing the pharmaceutical industry. There is a solid chance that the use of deep learning to discover drugs will dramatically change our lives for the better.
In the 1980s, there was a shift towards
Convolutional neural networks were first used to analyze biomedical images in 1995, when
Fast forward to 2012. Convolutional neural networks made a splash with the arrival of AlexNet, which led to a leap in performance on the now famous ImageNet dataset. The success of AlexNet, a network with five convolutional layers and three fully connected layers trained on gaming GPUs, has become so famous in machine learning that people now talk about “ImageNet moments” in different niches of machine learning and AI.
For example, “Natural Language Processing may have had its ImageNet moment with the development of large transformers in 2018” or “Reinforcement Learning is still waiting for its ImageNet moment.”
Almost ten years have passed since AlexNet. Computer vision and deep learning models have been improving steadily, and applications have gone beyond classification. Today these models can segment images, estimate depth, and automatically reconstruct 3D scenes from multiple 2D images. And this is not a complete list of their capabilities.
Deep learning for biomedical image analysis has become a hot area of research. A side effect is an inevitable increase in noise. Published in 2019
Most of them have not made any contributions to basic science or machine learning. A passion for deep learning has gripped academic researchers who had previously shown no interest in it, and for good reason. It can do the same thing as classical computer vision algorithms (see.
The cost of developing a new drug can reach
It is also leading to a spike in morbidity in the aptly named category of “neglected diseases”, including a disproportionate number of
One such startup in Salt Lake City, Utah is trying to do just that. Founders
By the end of 2019, the company had completed
The workflow described in this article is heavily based on information from the official docs [
Other startups in this area include
Next, I’ll describe the image analysis process and how deep learning fits into the rare disease drug discovery workflow. We will look at a high-level process that is applicable to a variety of other areas of drug discovery.
For example, it can be easily used to screen cancer drugs for their effect on tumor cell morphology. Perhaps even to analyze the response of cells of specific patients to different drug options. This approach uses concepts from
Maybe climate control in a laboratory works differently in summer and winter? Maybe someone had lunch next to the slides before inserting them into the microscope? Maybe the supplier of one of the ingredients of the culture medium has changed? Or has the supplier changed its own supplier? A huge number of variables affect the result of an experiment. Tracking and highlighting unintentional noise is one of the main challenges in data-driven drug discovery.
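To make the batch-effect problem concrete, here is a minimal sketch of one common mitigation: standardizing each feature within a batch against that batch's own control wells. The function name, the feature layout, and the assumption that every batch contains untreated control wells are mine, for illustration only; this is not the pipeline of any company mentioned in this essay.

```python
import numpy as np

def normalize_per_batch(features, batch_ids, is_control):
    """Standardize features within each batch using that batch's control wells.

    features   : (n_wells, n_features) array of image-derived measurements
    batch_ids  : (n_wells,) batch label per well (e.g. plate or experiment day)
    is_control : (n_wells,) boolean mask marking untreated control wells
    All three inputs are hypothetical; real pipelines differ.
    """
    features = np.asarray(features, dtype=float)
    batch_ids = np.asarray(batch_ids)
    is_control = np.asarray(is_control, dtype=bool)
    normalized = np.empty_like(features)
    for batch in np.unique(batch_ids):
        in_batch = batch_ids == batch
        controls = features[in_batch & is_control]
        if len(controls) == 0:
            raise ValueError(f"batch {batch!r} has no control wells")
        mean = controls.mean(axis=0)
        std = controls.std(axis=0)
        std = np.where(std == 0, 1.0, std)  # avoid division by zero
        normalized[in_batch] = (features[in_batch] - mean) / std
    return normalized

# Toy data: the same biology measured twice, but batch "B" is shifted and
# scaled, as if the microscope lamp were brighter that day.
rng = np.random.default_rng(0)
biology = rng.normal(size=(40, 3))
features = np.vstack([biology, 2.0 * biology + 5.0])
batch_ids = np.array(["A"] * 40 + ["B"] * 40)
is_control = np.tile(np.arange(40) < 20, 2)  # first 20 wells per batch

normalized = normalize_per_batch(features, batch_ids, is_control)
raw_gap = np.abs(features[:40] - features[40:]).max()
corrected_gap = np.abs(normalized[:40] - normalized[40:]).max()
print(f"largest gap between batches: raw={raw_gap:.2f}, corrected={corrected_gap:.2e}")
```

The toy batch effect here is a pure shift and scale, which this kind of normalization removes exactly. Real batch effects are rarely that tidy, which is part of why the problem is hard.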
Microscopy images can differ greatly even within the same experiment. Image brightness, cell shape, organelle shape, and many other characteristics change because of genuine physiological effects or random errors.
So, the images in the figure below are obtained from the same
These mutations can be mimicked by suppressing gene expression with
By learning from thousands of mutations instead of a single cellular model of a specific disease, the neural network learns to encode phenotypes in a high-dimensional latent space. The resulting code makes it possible to evaluate drugs by their ability to bring the disease phenotype closer to a healthy phenotype, each of which is represented by a multidimensional set of coordinates. Likewise, the side effects of drugs can be captured in the encoded representation of the phenotype, so drugs are evaluated not only on whether the symptoms of the disease disappear, but also on whether harmful side effects are kept to a minimum.
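The scoring idea above can be sketched in a few lines. Assume each well has already been turned into an embedding vector by the network; then the line from the healthy centroid to the diseased centroid defines a "disease axis". Movement back along that axis is read as rescue, and movement away from it is read as a rough proxy for side effects. The function `rescue_score` and the two scores it returns are hypothetical illustrations, not a published metric.

```python
import numpy as np

def rescue_score(healthy, diseased, treated):
    """Score a treatment by where it moves cells in the encoded space.

    Each argument is an (n_wells, n_dims) array of phenotype embeddings.
    Returns (on_target, off_target):
      on_target  : 1.0 means fully back at the healthy centroid,
                   0.0 means no improvement over the untreated disease state
      off_target : distance from the healthy-to-diseased line, in units of
                   the disease shift (a crude side-effect proxy)
    """
    healthy_c = np.asarray(healthy, dtype=float).mean(axis=0)
    diseased_c = np.asarray(diseased, dtype=float).mean(axis=0)
    treated_c = np.asarray(treated, dtype=float).mean(axis=0)

    disease_axis = diseased_c - healthy_c
    shift = treated_c - healthy_c
    # Position along the disease axis: 0 = healthy, 1 = diseased.
    position = np.dot(shift, disease_axis) / np.dot(disease_axis, disease_axis)
    # Whatever is left after removing the on-axis part is off-axis movement.
    residual = shift - position * disease_axis
    off_target = np.linalg.norm(residual) / np.linalg.norm(disease_axis)
    return 1.0 - position, off_target

# Toy embeddings in 4 dimensions (made-up numbers).
healthy = np.zeros((5, 4))
diseased = np.tile([4.0, 0.0, 0.0, 0.0], (5, 1))
drug_a = np.tile([1.0, 0.0, 0.0, 0.0], (5, 1))  # moves straight back toward healthy
drug_b = np.tile([1.0, 3.0, 0.0, 0.0], (5, 1))  # same rescue, plus an unrelated change

score_a = rescue_score(healthy, diseased, drug_a)
score_b = rescue_score(healthy, diseased, drug_b)
print("drug A:", score_a)
print("drug B:", score_b)
```

In this toy example both drugs reverse three quarters of the disease shift, but drug B also pushes the cells in a direction unrelated to the disease, so it gets a higher off-target score.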
This image-based drug discovery method also works well with the same DenseNet or ResNet architectures, with hundreds of layers, that deliver top performance on datasets like ImageNet.
Layer activation values, encoded in a multidimensional space, reflect phenotype, disease pathogenesis, relationships between treatments, side effects, and other conditions. All these factors can therefore be analyzed as displacements in the encoded space. This phenotypic code can be subjected to special regularization (for example, by minimizing the covariance between different layer activations) to reduce correlations within the code, or for other purposes.
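As an illustration of the covariance-based regularization mentioned above, here is a minimal sketch of one possible penalty: the sum of squared off-diagonal entries of the covariance matrix of a batch of activations. The function name and the exact form of the penalty are my assumptions; the text does not say which form is actually used. In training, a term like this would be added to the main loss with a small weight.

```python
import numpy as np

def decorrelation_penalty(activations):
    """Penalty that grows when different embedding dimensions co-vary.

    activations : (batch_size, n_dims) array of layer activations.
    Returns half the sum of squared off-diagonal covariances, which is
    zero only when every pair of dimensions is uncorrelated in the batch.
    """
    activations = np.asarray(activations, dtype=float)
    centered = activations - activations.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / activations.shape[0]
    off_diagonal = cov - np.diag(np.diag(cov))
    return 0.5 * np.sum(off_diagonal ** 2)

# Two toy batches with two embedding dimensions each.
redundant = np.array([[1.0, 1.0], [-1.0, -1.0], [1.0, 1.0], [-1.0, -1.0]])
independent = np.array([[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]])

penalty_redundant = decorrelation_penalty(redundant)      # second column copies the first
penalty_independent = decorrelation_penalty(independent)  # columns are uncorrelated
print(penalty_redundant, penalty_independent)
```

The redundant batch is penalized because its two dimensions carry the same information, while the independent batch incurs no penalty.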
The figure below shows a simplified model. Black arrows represent convolution + pooling operations. Blue lines represent dense (fully connected) connections. For simplicity, the number of layers has been reduced and residual connections are not shown.
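The simplified model can also be sketched as a forward pass: two convolution + pooling stages followed by one dense layer that outputs the phenotype code. This is a toy under stated assumptions: the weights are random, and the layer sizes, the 3-channel 32x32 input, and the 64-dimensional code are invented for illustration. Like the figure, it omits residual connections. A real model would be a trained DenseNet or ResNet built in a deep learning framework.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv_relu_pool(x, kernels):
    """Valid convolution (cross-correlation), ReLU, then 2x2 max pooling.

    x       : (c_in, height, width) input feature maps
    kernels : (c_out, c_in, k, k) filter bank
    """
    k = kernels.shape[-1]
    # windows has shape (c_in, height-k+1, width-k+1, k, k)
    windows = sliding_window_view(x, (k, k), axis=(1, 2))
    maps = np.einsum("chwij,ocij->ohw", windows, kernels)
    maps = np.maximum(maps, 0.0)  # ReLU
    c, h, w = maps.shape
    h2, w2 = h // 2, w // 2
    # Crop to an even size, then take the max over each 2x2 block.
    return maps[:, :2 * h2, :2 * w2].reshape(c, h2, 2, w2, 2).max(axis=(2, 4))

def embed(image, params):
    """Map an image to a phenotype code: conv+pool, conv+pool, dense."""
    x = conv_relu_pool(image, params["conv1"])
    x = conv_relu_pool(x, params["conv2"])
    return params["dense_w"] @ x.ravel() + params["dense_b"]

rng = np.random.default_rng(0)
params = {  # random, untrained weights; all sizes are illustrative
    "conv1": rng.normal(scale=0.1, size=(8, 3, 3, 3)),    # 32x32 -> 30x30 -> 15x15
    "conv2": rng.normal(scale=0.1, size=(16, 8, 3, 3)),   # 15x15 -> 13x13 -> 6x6
    "dense_w": rng.normal(scale=0.1, size=(64, 16 * 6 * 6)),
    "dense_b": np.zeros(64),
}
image = rng.random(size=(3, 32, 32))  # stand-in for a 3-channel microscopy crop
embedding = embed(image, params)
print(embedding.shape)  # (64,)
```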
This approach has already shown results. We are seeing significant research progress, and several drugs are already in phase 1 clinical trials. Teams of just a few hundred scientists and engineers at companies such as Recursion Pharmaceuticals have achieved this. Other startups are not far behind: TwoXAR has several drug candidates in preclinical trials in other disease categories.
The deep learning and computer vision approach to drug development can be expected to have a significant impact on large pharmaceutical companies and healthcare in general. We will soon see how this will affect the development of new treatments for common diseases (including heart disease and diabetes), as well as rare ailments that have remained out of sight to this day.