PIGNet: A physics-informed deep learning model toward generalized drug-target interaction predictions.

Open Access • Posted Content

- 22 Aug 2020

139

TL;DR: A physics-informed strategy is proposed in which atom–atom pairwise interactions are predicted by physics-informed equations parameterized with neural networks, and the total binding affinity of a protein–ligand complex is obtained as their sum.

Figures

Fig. 2 The training scheme of PIGNet. We use three types of data in model training: true binding complexes, true binder ligand-protein pairs in computer-generated binding poses, and non-binding decoy complexes. PIGNet predicts the binding free energy for each input. For a true binding complex, the model learns to predict its true binding energy. For a computer-generated binding pose or a non-binding decoy complex, the model learns to predict an energy higher than the true binding energy or a threshold energy, respectively. Finally, PIGNet learns the proper correlation between ligand atom positions and binding affinity by minimizing the derivative loss.
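The three data types in the Fig. 2 caption can be read as three loss terms. The following is a minimal sketch under stated assumptions, not the authors' implementation: the squared-error form, the hinge form, the weights `w_pose` and `w_decoy`, and all function names are hypothetical, and the derivative loss mentioned in the caption is left out because the caption does not define it.

```python
def energy_loss(pred, true):
    """Squared error for a true binding complex (assumed form)."""
    return (pred - true) ** 2


def lower_bound_penalty(pred, bound):
    """Hinge penalty (assumed form): zero once `pred` is at least `bound`."""
    return max(bound - pred, 0.0)


def total_loss(crystal, generated_poses, decoys, decoy_threshold,
               w_pose=1.0, w_decoy=1.0):
    """Combine the three terms described in the caption.

    crystal:          list of (predicted, true) energies for true complexes
    generated_poses:  list of (predicted, true) energies for true binders
                      placed in computer-generated poses
    decoys:           list of predicted energies for non-binding decoys
    decoy_threshold:  threshold energy; its value is not given in the caption
    """
    loss = sum(energy_loss(p, t) for p, t in crystal)
    # A generated pose should not score lower (better) than the true energy.
    loss += w_pose * sum(lower_bound_penalty(p, t) for p, t in generated_poses)
    # A decoy should not score lower (better) than the threshold energy.
    loss += w_decoy * sum(lower_bound_penalty(p, decoy_threshold) for p in decoys)
    return loss


# Energies in kcal/mol; the numbers are made up for illustration.
example = total_loss(
    crystal=[(-9.0, -10.0)],
    generated_poses=[(-11.0, -10.0)],
    decoys=[-5.0],
    decoy_threshold=-6.0,
)
```

In this toy call the true complex and the generated pose each contribute a penalty, while the decoy already lies above the threshold and contributes nothing.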

Fig. 3 Interpretation of the predicted outcomes. a. Substructural analysis of ligands for two target proteins: protein-tyrosine phosphatase non-receptor type 1 (PTPN1) and platelet activating factor acetylhydrolase (PAF-AH). The blue and red circles indicate common and different substructures, respectively, and the predicted energy contribution (unit: kcal/mol) of each substructure is annotated. The inhibitory constant, Ki, indicates how potently the ligand binds to the target protein. b. A distance-energy plot of carbon-carbon pairwise van der Waals (VDW) energy components in the test set. The red solid line illustrates the original distance-energy relation without any deviation induced by learnable parameters. The closer the color of a data point is to yellow, the larger the number of corresponding carbon-carbon pairs. c. The average value of the corrected sum of VDW radii, d′ij, for different carbon-carbon pair types. Csp2–Csp2, Csp2–Csp3, and Csp3–Csp3 pairs are compared. The results include 95% confidence intervals.
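To make the distance-energy relation in panel b concrete, here is a minimal sketch that assumes a 12-6 Lennard-Jones-type form whose minimum sits at the corrected sum of VDW radii. The caption does not give the functional form, so the equation, the additive `correction` term, and the parameter names are all assumptions made for illustration.

```python
def corrected_radius_sum(r_i, r_j, correction=0.0):
    """Sum of two VDW radii plus a learnable shift (hypothetical parameterization)."""
    return r_i + r_j + correction


def vdw_energy(d, d_prime, well_depth):
    """Assumed 12-6 form: the minimum is -well_depth at d == d_prime."""
    ratio = d_prime / d
    return well_depth * (ratio ** 12 - 2.0 * ratio ** 6)


# Illustrative numbers only: radii in angstroms, well depth in kcal/mol.
d_prime = corrected_radius_sum(2.0, 2.0)
e_min = vdw_energy(4.0, d_prime, 0.2)      # at the minimum
e_close = vdw_energy(3.0, d_prime, 0.2)    # repulsive region
e_far = vdw_energy(5.0, d_prime, 0.2)      # attractive tail
```

Setting `correction` to zero corresponds to a curve without learned deviation; a nonzero value shifts the location of the minimum.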

Table 2 The CASF-2016 benchmark results for the 3D GNN-based model and PIGNet (Single) with and without data augmentation. The highest values within the same model are shown in bold.

Fig. 1 Our model architecture. A protein-ligand complex is represented in a graph and adjacency matrices are assigned from the binding structure of the complex. Each node feature is updated through neural networks to carry the information of covalent bonds and intermolecular interactions. Given the distance and final node features of each atom pair, four energy components are calculated from the physics-informed parameterized equations. The total binding affinity is obtained as a sum of pairwise binding affinities, which is a sum of the four energy components divided by an entropy term.
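The final aggregation step in the Fig. 1 caption can be sketched as follows. This is a hedged illustration only: the caption does not name the four energy components or define the entropy term, so each pair is represented here as a plain 4-tuple of made-up numbers and `entropy_term` is an opaque positive scalar shared by all pairs.

```python
def total_binding_affinity(pair_components, entropy_term):
    """Sum four energy components over all atom pairs, then divide by the entropy term.

    pair_components: iterable of 4-tuples, one tuple per atom pair
    entropy_term:    positive scalar (its definition is not given in the caption)
    """
    if entropy_term <= 0:
        raise ValueError("entropy_term must be positive")
    pairs = list(pair_components)
    if any(len(components) != 4 for components in pairs):
        raise ValueError("each atom pair needs exactly four energy components")
    return sum(sum(components) for components in pairs) / entropy_term


# Two hypothetical atom pairs, energies in kcal/mol.
pairs = [(-1.0, -0.5, 0.0, -0.5), (-2.0, 0.0, 0.0, 0.0)]
affinity = total_binding_affinity(pairs, entropy_term=2.0)
```

Because the entropy term is assumed to be shared, dividing each pairwise sum before adding or dividing the grand total afterwards gives the same result.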

Fig. 4 Plot of the average Pearson’s correlation coefficients, R, of the 5-fold PIGNet model, with or without the uncertainty estimator, on the datasets classified according to the total uncertainty. PIGNet with the uncertainty estimator - low: the lowest third, random: a randomly selected third, high: the highest third of the uncertainty distribution. PIGNet without Monte Carlo dropout - baseline: the scores of a single PIGNet model shown in Table 1. Error bars represent 95% confidence intervals. PIGNet was tested at the 2,300th training epoch with and without Monte Carlo dropout.
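The grouping used in Fig. 4 can be sketched as below. This is an illustrative assumption rather than the paper's procedure: the variance across repeated stochastic (dropout-enabled) predictions stands in for the total uncertainty, and the function names are hypothetical.

```python
from statistics import mean, pvariance


def mc_summary(samples):
    """Mean prediction and variance over repeated stochastic forward passes."""
    return mean(samples), pvariance(samples)


def split_by_uncertainty(uncertainties):
    """Return (low, high): indices of the lowest and highest third by uncertainty."""
    order = sorted(range(len(uncertainties)), key=lambda i: uncertainties[i])
    third = len(order) // 3
    if third == 0:
        return [], []
    return order[:third], order[len(order) - third:]


# Made-up uncertainties for six complexes.
low, high = split_by_uncertainty([0.5, 0.1, 0.9, 0.3, 0.7, 0.2])
pred, unc = mc_summary([1.0, 2.0, 3.0])
```

The correlation coefficient would then be computed separately on the `low` and `high` index sets and compared with a randomly drawn third.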

Table 1 Benchmark test results on the CASF-2016 and the CSAR NRC-HiQ datasets. R and ρ denote the Pearson correlation coefficient and Spearman’s rank correlation coefficient, respectively. The top 1 score was used for the docking success rate, and the top 1% rate was used for the average enrichment factor (EF) and the screening success rate. ΔVinaRF20 (ref. 63) was excluded from the comparison, as it was fine-tuned on the PDBbind 2017 data, which in fact includes ~50% of the data in the CASF-2016 test set. The highest values of each column are shown in bold.
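For readers unfamiliar with the metrics named in the Table 1 caption, the following standard-library sketch computes Pearson's R, Spearman's ρ, and a top-fraction enrichment factor. The enrichment-factor convention used here (hit rate in the top fraction divided by the overall hit rate, with lower scores ranked first) is an assumption and may differ in detail from the benchmark's official scripts.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def enrichment_factor(scores, is_active, fraction=0.01):
    """Hit rate among the best-scored fraction divided by the overall hit rate.

    Assumes a lower score means a stronger predicted binder.
    """
    n = len(scores)
    n_top = max(1, int(n * fraction))
    top = sorted(range(n), key=lambda i: scores[i])[:n_top]
    hits = sum(is_active[i] for i in top)
    return (hits / n_top) / (sum(is_active) / n)


# Toy data: 100 compounds, one active, ranked first by score.
toy_scores = [float(i) for i in range(100)]
toy_labels = [1] + [0] * 99
toy_ef = enrichment_factor(toy_scores, toy_labels, fraction=0.01)
```

With a single active ranked first among 100 compounds, the top 1% contains only that active, so the toy enrichment factor reaches its maximum for this set.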

Citations

Proceedings Article • 10.48550/arXiv.2210.01776

DiffDock: Diffusion Steps, Twists, and Turns for Molecular Docking

Gabriele Corso, +4 more

- 04 Oct 2022

TL;DR: DiffDock is a diffusion generative model over the non-Euclidean manifold of ligand poses that predicts the binding structure of a small molecule ligand to a protein.

256

Posted Content • 10.1101/2022.06.06.495043

TANKBind: Trigonometry-Aware Neural NetworKs for Drug-Protein Binding Structure Prediction

Wei Lu, +5 more

- 25 Oct 2022

- bioRxiv

TL;DR: This paper proposes Trigonometry-Aware Neural networKs for binding structure prediction (TANKBind), which build trigonometry constraints into the model as a vigorous inductive bias and explicitly attend to all possible binding sites for each protein by segmenting the whole protein into functional blocks.

143

Journal Article • 10.1021/acs.jmedchem.2c00991

Boosting Protein-Ligand Binding Pose Prediction and Virtual Screening Based on Residue-Atom Distance Likelihood Potential and Graph Transformer.

Chao Shen, +8 more

- 02 Aug 2022

- Journal of Medicinal Chemistry

TL;DR: A novel scoring function named RTMScore is developed by introducing a tailored residue-based graph representation strategy and several graph transformer layers for learning protein and ligand representations, followed by a mixture density network to obtain a residue-atom distance likelihood potential.

106

Journal Article • 10.3389/fbinf.2022.885983

Scoring Functions for Protein-Ligand Binding Affinity Prediction Using Structure-based Deep Learning: A Review

Rocco Meli, +2 more

- 17 Jun 2022

- Frontiers in bioinformatics

TL;DR: This work reviews structure-based scoring functions for binding affinity prediction based on deep learning, focussing on different types of architectures, featurization strategies, data sets, methods for training and evaluation, and the role of explainable artificial intelligence in building useful models for real drug-discovery applications.

95

Journal Article • 10.1016/j.sbi.2023.102548

Structure-based drug design with geometric deep learning

Gisbert Schneider

- 01 Apr 2023

- Current Opinion in Structural Biology

TL;DR: Geometric deep learning, an emerging concept of neural-network-based machine learning, has been applied to macromolecular structures, highlighting its potential for structure-based drug discovery and design.

82

References

Journal Article • 10.1002/JCC.21334

AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading

Oleg Trott, +1 more

- 04 Jun 2009

- Journal of Computational Chemistry

TL;DR: AutoDock Vina achieves an approximately two orders of magnitude speed‐up compared with the molecular docking software previously developed in the lab, while also significantly improving the accuracy of the binding mode predictions, judging by tests on the training set used in AutoDock 4 development.

30.1K

Posted Content

Empirical evaluation of gated recurrent neural networks on sequence modeling

Junyoung Chung, +5 more

- 11 Dec 2014

- arXiv: Neural and Evolutionary Computing

TL;DR: Recurrent units that implement a gating mechanism, such as the long short-term memory (LSTM) unit and the more recently proposed gated recurrent unit (GRU), are evaluated on sequence modeling, and the GRU is found to be comparable to the LSTM.

14.1K

Journal Article • 10.1093/BIOINFORMATICS/BTL158

Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences

Weizhong Li, +1 more

- 01 Jul 2006

- Bioinformatics

TL;DR: Cd-hit-2d compares two protein datasets and reports similar matches between them; cd-hit-est clusters a DNA/RNA sequence database, and cd-hit-est-2d compares two nucleotide datasets.

10.7K

Journal Article • 10.1093/BIOINFORMATICS/BTS565

Cd-hit

Limin Fu, +4 more

- 01 Dec 2012

- Bioinformatics

TL;DR: A new CD-HIT program accelerated with a novel parallelization strategy and some other techniques to allow efficient clustering of such datasets to reduce sequence redundancy and improve the performance of other sequence analyses is developed.

9.2K

Book Chapter • 10.7551/mitpress/1120.001.0001

Advances in Neural Information Processing Systems 14

08 Nov 2002

Abstract: The proceedings of the 2001 Neural Information Processing Systems (NIPS) Conference. The annual conference on Neural Information Processing Systems (NIPS) is the flagship conference on neural computation. The conference is interdisciplinary, with contributions in algorithms, learning theory, cognitive science, neuroscience, vision, speech and signal processing, reinforcement learning and control, implementations, and diverse applications. Only about 30 percent of the papers submitted are accepted for presentation at NIPS, so the quality is exceptionally high. These proceedings contain all of the papers that were presented at the 2001 conference. Bradford Books imprint

8.9K