SeqRate: sequence-based protein folding type classification and rates prediction.
TL;DR: SeqRate is the first sequence-based method for protein folding type classification, and its accuracy of folding rate prediction is improved over previous sequence-based methods.
Abstract: Background: Protein folding rate is an important property of a protein. Predicting protein folding rate is useful for understanding the protein folding process and for guiding protein design. Most previous methods of predicting protein folding rate require the tertiary structure of a protein as input, and most do not distinguish the kinetic nature (two-state or multi-state folding) of the proteins. Here we developed a method, SeqRate, to predict both the protein folding kinetic type (two-state versus multi-state) and the real-value folding rate with support vector machines, using sequence length, amino acid composition, contact order, contact number, and secondary structure information predicted from the protein sequence alone. Results: We systematically studied the contributions of individual features to folding rate prediction. On a standard benchmark dataset, the accuracy of folding kinetic type classification is 80%. The Pearson correlation coefficient and the mean absolute difference between predicted and experimental folding rates (sec⁻¹) in the base-10 logarithmic scale are 0.81 and 0.79 for two-state protein folders, and 0.80 and 0.68 for three-state protein folders. SeqRate is the first sequence-based method for protein folding type classification, and its accuracy of folding rate prediction is improved over previous sequence-based methods. Its performance can be further enhanced with additional inputs, such as structure-based geometric contacts. Conclusions: Both the web server and the software for predicting folding rate are publicly available at http://casp.rnet.missouri.edu/fold_rate/index.html.
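As a rough illustration of two pieces the abstract names, the sketch below computes the simplest sequence-derived inputs (sequence length and amino acid composition) and the two reported evaluation measures (Pearson correlation and mean absolute difference on base-10 logarithms of folding rates). It is not SeqRate's implementation: the function names `sequence_features`, `pearson_r`, and `mean_abs_diff` are hypothetical, the folding rates are made-up numbers chosen only to exercise the code, and the predicted contact order, contact number, and secondary structure features as well as the support vector machines themselves are omitted.

```python
import math

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def sequence_features(seq):
    """Return [length, 20 composition fractions] for a protein sequence.

    Hypothetical helper: covers only two of the feature groups listed
    in the abstract (sequence length and amino acid composition).
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    return [float(n)] + [seq.count(a) / n for a in AMINO_ACIDS]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def mean_abs_diff(xs, ys):
    """Mean absolute difference between two equal-length lists."""
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)


# Made-up folding rates in sec^-1, for illustration only.
experimental = [1000.0, 10.0, 0.1, 100.0]
predicted = [100.0, 10.0, 1.0, 1000.0]

# Both measures are computed on the base-10 logarithmic scale.
log_experimental = [math.log10(k) for k in experimental]
log_predicted = [math.log10(k) for k in predicted]

r = pearson_r(log_experimental, log_predicted)
mad = mean_abs_diff(log_experimental, log_predicted)
print(f"Pearson r = {r:.2f}, mean absolute difference = {mad:.2f}")
```

Working on the logarithmic scale means a mean absolute difference of 1.0 corresponds to predictions that are off by a factor of ten on average, which is how values such as 0.79 and 0.68 in the abstract can be read.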
Citations
Computational and Theoretical Methods for Protein Folding
Mario Compiani, Emidio Capriotti +1 more
TL;DR: It is argued that computation plays not merely an ancillary role but has a more constructive function in that computational work may precede theory and experiments and can provide the primary conceptual clues to inspire subsequent theoretical and experimental work.
The MULTICOM toolbox for protein structure prediction
TL;DR: A comprehensive MULTICOM toolbox consisting of a set of protein structure and structural feature prediction tools, achieving state-of-the-art or near performance, is made freely available for academic use and scientific research.
Stepwise optimization of recombinant protein production in Escherichia coli utilizing computational and experimental approaches.
Kulandai Arockia Rajesh Packiam, Ramakrishnan Nagasundara Ramanan, Chien Wei Ooi, Lakshminarasimhan Krishnaswamy, Beng Ti Tey +4 more
TL;DR: A stepwise methodology linking the factors from both levels for optimizing the production of soluble recombinant protein in E. coli is proposed, which can facilitate the optimization of gene- and protein-based factors in silico tools.
Solution of Levinthal's Paradox and a Physical Theory of Protein Folding Times.
Dmitry N. Ivankov, Alexei V. Finkelstein +1 more
- 06 Feb 2020
TL;DR: The key ideas and discoveries leading to the current understanding of folding kinetics, including the solution of Levinthal’s paradox are discussed, as well as the current state of the art in the prediction of protein folding times.
Folding RaCe: a robust method for predicting changes in protein folding rates upon point mutations
TL;DR: This work has developed a robust knowledge-based methodology to predict the changes in folding rates upon mutations, formulated from amino acid properties using a multiple linear regression approach, and highlights the importance of outlier detection and of studying the implications of outliers for the folding mechanism.