OUP accepted manuscript

doi:10.1093/bioinformatics/btac006

Open Access•Journal Article•10.1093/bioinformatics/btac006

06 Jan 2022

- Bioinformatics

- Vol. 38, Iss: 6, pp 1514-1524

131

TL;DR: This paper proposes ToxIBTL, a novel deep learning framework that uses the information bottleneck principle and transfer learning to predict the toxicity of peptides as well as proteins, and that achieves higher prediction performance than state-of-the-art methods on the peptide dataset.

Abstract:

Motivation: Recently, peptides have emerged as a promising class of pharmaceuticals for treating various diseases, positioned between traditional small-molecule drugs and therapeutic proteins. However, one of the key bottlenecks preventing their use as therapeutics is their toxicity toward human cells, and few of the available toxicity prediction algorithms are specifically designed for short peptides.

Results: We present ToxIBTL, a novel deep learning framework that uses the information bottleneck principle and transfer learning to predict the toxicity of peptides as well as proteins. Specifically, we use evolutionary information and physicochemical properties of peptide sequences and integrate the information bottleneck principle into a feature representation learning scheme, so that relevant information is retained and redundant information is minimized in the learned features. Moreover, transfer learning is introduced to transfer the common knowledge contained in proteins to peptides, with the aim of improving the feature representation capability. Extensive experimental results demonstrate that ToxIBTL not only achieves higher prediction performance than state-of-the-art methods on the peptide dataset, but also performs competitively on the protein dataset. Furthermore, a user-friendly online web server has been established as an implementation of ToxIBTL.

Availability and implementation: ToxIBTL and the data are freely accessible at http://server.wei-group.net/ToxIBTL. The source code is available at https://github.com/WLYLab/ToxIBTL.

Supplementary information: Supplementary data are available at Bioinformatics online.
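
For orientation, the information bottleneck principle named in the abstract is usually written as the trade-off below. This is the standard textbook formulation, not necessarily the exact loss used in ToxIBTL. The symbols are notation assumed here for illustration: X is the input (peptide features), Z the learned representation, Y the toxicity label, I(·;·) mutual information, and β > 0 a trade-off weight.

```latex
% Standard information bottleneck objective (illustrative notation, not taken from the paper):
% compress the input X into Z while keeping the information Z carries about the label Y.
\min_{p(z \mid x)} \; I(X; Z) \;-\; \beta \, I(Z; Y)
```

A larger β favors retaining label-relevant information in Z, while a smaller β favors discarding more of the input, which matches the abstract's description of keeping relevant information and minimizing redundant information.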

Citations

Journal Article•10.1093/bib/bbac174

OUP accepted manuscript

21 May 2022

- Briefings in Bioinformatics

TL;DR: The authors developed ToxinPred2, a web-based tool for predicting the toxicity of proteins based on similarity, motif-based similarity, and prediction models.

129

Journal Article•10.1093/bib/bbac174

ToxinPred2: an improved method for predicting toxicity of proteins

Neelam Sharma, +3 more

- 21 May 2022

- Briefings in Bioinformatics

TL;DR: A general method is developed for predicting the toxicity of proteins regardless of their source of origin; a hybrid method combining all three approaches achieved a maximum area under the receiver operating characteristic curve of around 0.99.

51

Journal Article•10.1038/s44222-024-00152-x

Machine learning for antimicrobial peptide identification and design

Fangping Wan, +3 more

- 26 Feb 2024

- Nature Reviews Bioengineering

49

Journal Article•10.1016/j.jmb.2022.167549

THRONE: A New Approach for Accurate Prediction of Human RNA N7-Methylguanosine Sites.

Watshara Shoombuatong, +4 more

- 01 Mar 2022

- Journal of Molecular Biology

TL;DR: THRONE feeds a wide range of sequence-based features into several ML classifiers and combines these models through ensemble learning to identify m7G sites from the human genome.

43

Journal Article•10.1101/2023.08.11.552911

ToxinPred 3.0: An improved method for predicting the toxicity of peptides

Anand Singh Rathore, +4 more

- 14 Aug 2023

- bioRxiv

TL;DR: A refined variant of ToxinPred is proposed that showcases improved reliability and accuracy in predicting peptide toxicity, and hybrid or ensemble methods combining two or more models to enhance performance are developed.

37

References

•Journal Article•10.1093/NAR/25.17.3389

Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.

Stephen F. Altschul, +6 more

- 01 Sep 1997

- Nucleic Acids Research

TL;DR: A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original.

76.5K

•Journal Article

Visualizing Data using t-SNE

Laurens van der Maaten, +1 more

- 01 Jan 2008

- Journal of Machine Learning Research

TL;DR: A new technique called t-SNE that visualizes high-dimensional data by giving each datapoint a location in a two or three-dimensional map, a variation of Stochastic Neighbor Embedding that is much easier to optimize, and produces significantly better visualizations by reducing the tendency to crowd points together in the center of the map.

45.8K

•Proceedings Article

LightGBM: a highly efficient gradient boosting decision tree

Guolin Ke, +7 more

- 04 Dec 2017

TL;DR: It is proved that, since data instances with larger gradients play a more important role in the computation of information gain, GOSS (Gradient-based One-Side Sampling) can obtain a quite accurate estimate of the information gain with a much smaller data size; the resulting GBDT implementation is called LightGBM.

9.8K

•Journal Article•10.1093/BIOINFORMATICS/BTU031

InterProScan 5: genome-scale protein function classification

Philip Jones, +16 more

- 01 May 2014

- Bioinformatics

TL;DR: A new Java-based architecture for the widely used protein function prediction software package InterProScan is described, resulting in a flexible and stable system that is able to use both multiprocessor machines and/or conventional clusters to achieve scalable distributed data analysis.

8K

•Journal Article•10.1109/JPROC.2020.3004555

A Comprehensive Survey on Transfer Learning

Fuzhen Zhuang, +7 more

- 01 Jan 2021

TL;DR: Transfer learning aims to improve the performance of target learners on target domains by transferring the knowledge contained in different but related source domains, which reduces the dependence on large amounts of target-domain data when constructing target learners.

5.3K