Please use this identifier to cite or link to this item: https://www.arca.fiocruz.br/handle/icict/46077
Title: De novo design and bioactivity prediction of SARS‑CoV‑2 main protease inhibitors using recurrent neural network‑based transfer learning
Authors: Santana, Marcos V. S.
Silva Jr., Floriano P.
Affiliation: Fundação Oswaldo Cruz. Instituto Oswaldo Cruz. LaBECFar - Laboratório de Bioquímica Experimental e Computacional de Fármacos. Rio de Janeiro, RJ, Brasil.
Fundação Oswaldo Cruz. Instituto Oswaldo Cruz. LaBECFar - Laboratório de Bioquímica Experimental e Computacional de Fármacos. Rio de Janeiro, RJ, Brasil.
Abstract: The global pandemic of coronavirus disease (COVID-19), caused by SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2), created a rush to discover drug candidates. Despite these efforts, so far no vaccine or drug has been approved for treatment. Artificial intelligence offers solutions that could accelerate the discovery and optimization of new antivirals, especially in the current scenario dominated by the scarcity of compounds active against SARS-CoV-2. The main protease (Mpro) of SARS-CoV-2 is an attractive target for drug discovery because it is absent in humans and essential for viral replication. In this work, we developed a deep learning platform for de novo design of putative inhibitors of SARS-CoV-2 Mpro. Our methodology consists of three main steps: (1) training and validation of a general chemistry-based generative model; (2) fine-tuning of the generative model for the chemical space of SARS-CoV-2 Mpro inhibitors; and (3) training of a classifier for bioactivity prediction using transfer learning. The fine-tuned chemical model generated > 90% valid, diverse and novel (not present in the training set) structures. The generated molecules showed a good overlap with Mpro chemical space, displaying similar physicochemical properties and chemical structures. In addition, novel scaffolds were also generated, showing the potential to explore new chemical series. The classification model outperformed the baseline area under the precision-recall curve, showing it can be used for prediction. The model also outperformed the freely available model Chemprop on an external test set of fragments screened against SARS-CoV-2 Mpro, showing its potential to identify putative antivirals to tackle the COVID-19 pandemic. Finally, molecular docking identified nine of the top-20 predicted hits with binding poses and interactions similar to those of experimentally validated inhibitors.
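The abstract compares the classifier against a baseline area under the precision-recall curve. As a minimal sketch of what such a comparison involves, assuming the usual convention that a random classifier's baseline equals the fraction of active compounds, the Python below computes a step-wise average precision and that baseline. The function names, labels and scores are hypothetical and invented for illustration; they are not the authors' code or data, and tied scores are not given any special treatment.

```python
def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (average precision).

    labels: 0/1 activity labels (1 = active); scores: predicted scores.
    Illustrative only; ties between scores are broken by sort order.
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive label")
    # Rank compounds from highest to lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


def baseline_auprc(labels):
    """Expected AUPRC of a random classifier: the prevalence of actives."""
    return sum(labels) / len(labels)


# Toy data, made up for this sketch (2 actives among 8 compounds).
labels = [1, 0, 0, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.7, 0.2, 0.1, 0.4, 0.05]

ap = average_precision(labels, scores)
base = baseline_auprc(labels)
print(f"average precision = {ap:.3f}, baseline = {base:.3f}")
```

A model is informative under this metric only when its average precision exceeds the prevalence baseline, which matters for imbalanced screening sets where actives are rare.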
Keywords: COVID-19
SARS-CoV-2
ULMFiT
Transfer learning
De novo drug design
Generative model
Issue Date: 2021
Publisher: BMC
Citation: SANTANA, Marcos V. S.; SILVA JR., Floriano P. De novo design and bioactivity prediction of SARS‑CoV‑2 main protease inhibitors using recurrent neural network‑based transfer learning. BMC Chemistry, v. 15, n. 8, p. 1-20, 2021.
DOI: 10.1186/s13065-021-00737-2
ISSN: 2661-801X
Copyright: open access
Appears in Collections: IOC - Artigos de Periódicos
Files in This Item:
File: Santana_Marcos_etal_IOC_2021_COVID-19.pdf
Size: 5.47 MB
Format: Adobe PDF



Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.