COMPREHENSIVE STUDY OF GENERATIVE METHODS ON DRUG DISCOVERY
Abstract
Deep learning (DL) has recently succeeded in several life-changing application areas, e.g., autonomous driving, image/video search and discovery, and natural language processing, and this success has opened many new opportunities. One of the biggest lies in applying DL to accelerate drug discovery, where millions of human lives could potentially be saved. However, applying DL to drug discovery turns out to be non-trivial. The most successful DL methods take as input either fixed-size tensors/matrices, e.g., images, or sequences of tokens, e.g., sentences with varying numbers of words. Neither matches the inputs of drug discovery, i.e., chemical compounds. Because chemical compounds are inherently structural, a graph data structure is often used to represent a compound at the atomic level. Deep learning techniques on graphs are therefore being actively studied as a major opportunity for improvement. In this paper, we survey recent academic progress in generative deep learning methods on graphs for drug discovery applications. We narrow our scope to one of the most important deep generative models, the Variational AutoEncoder (VAE). We begin with the earliest stage, in which a VAE treats each atom of a molecule completely separately and ignores structural information entirely. This approach is quite limited because the structural information is discarded. We then introduce the baseline method, the Grammar Variational AutoEncoder (GVAE), which encodes the grammar of the chemical representation in the model. The Syntax-Directed Variational AutoEncoder (SDVAE) improves upon the GVAE by enforcing syntax validation in the decoder. Since then, several variants of these methods have appeared. One of them, the Junction Tree Variational AutoEncoder (JTVAE), encodes and decodes molecules in two steps: a junction-tree macrostructure whose minimum unit is a chemical sub-component, and a microstructure whose minimum unit is an atom. Finally, we introduce GraphVAE, in which a predefined maximum number of atoms is enforced in the decoder. These methods turn out to be effective in avoiding the generation of invalid molecules. We show the effectiveness of all the methods in extensive experiments. In conclusion, deep learning techniques have brought new hope to drug discovery, while many opportunities for growth remain open.
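As a rough illustration of the graph representation and the predefined maximum atom number mentioned above, the following minimal sketch pads a molecule to a fixed size so that it can be handled as fixed-size tensors. The atom vocabulary, the limit of four atoms, and the function name `encode_molecule` are assumptions made for this example only and are not taken from the surveyed methods.

```python
from typing import List, Tuple

# Hypothetical settings for illustration; not taken from the surveyed papers.
ATOM_TYPES = ["C", "N", "O"]  # assumed atom vocabulary
MAX_ATOMS = 4                 # assumed predefined maximum atom number


def encode_molecule(
    atoms: List[str],
    bonds: List[Tuple[int, int, int]],
    max_atoms: int = MAX_ATOMS,
) -> Tuple[List[List[int]], List[List[int]]]:
    """Return (node_features, adjacency) padded to max_atoms.

    atoms: element symbols of the heavy atoms.
    bonds: (i, j, order) triples indexing into `atoms`.
    """
    if len(atoms) > max_atoms:
        raise ValueError("molecule exceeds the predefined maximum atom number")

    # One-hot atom types, with one extra column marking padded (empty) slots.
    n_types = len(ATOM_TYPES) + 1
    pad_index = len(ATOM_TYPES)
    node_features = []
    for i in range(max_atoms):
        row = [0] * n_types
        index = ATOM_TYPES.index(atoms[i]) if i < len(atoms) else pad_index
        row[index] = 1
        node_features.append(row)

    # Symmetric adjacency matrix; entries store the bond order, 0 means no bond.
    adjacency = [[0] * max_atoms for _ in range(max_atoms)]
    for i, j, order in bonds:
        adjacency[i][j] = order
        adjacency[j][i] = order
    return node_features, adjacency


# Heavy atoms of ethanol (C-C-O), padded to four slots.
nodes, adj = encode_molecule(["C", "C", "O"], [(0, 1, 1), (1, 2, 1)])
```

Padding every molecule to the same size is what lets a decoder emit a fixed-size output; the cost is that molecules larger than the limit cannot be represented.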