MaxSubSeq: an algorithm for segment-length optimization. The case study of the transmembrane spanning segments.
TL;DR: This paper gives a detailed description of MaxSubSeq, a general dynamic programming-like algorithm designed to optimize the number and length of length-constrained segments in a given protein sequence.
Abstract:
MOTIVATION: A problem in predicting the topography of transmembrane proteins is the optimal localization of the transmembrane segments along the protein sequence, given that each residue is associated with a propensity for being included in the transmembrane region or not. Previous work has shown that post-processing the propensity signals with suitable algorithms can greatly improve the quality and accuracy of the predictions. In this paper we describe a general dynamic programming-like algorithm (MaxSubSeq, Maximal SubSequence) specifically designed to optimize the number and length of segments with constrained length in a given protein sequence. Previous applications of our algorithm have proved its effectiveness in optimizing the output of both neural networks and hidden Markov models, and in this paper we present a detailed description of MaxSubSeq.
RESULTS: We describe the application of MaxSubSeq to the location of both helical and beta-strand transmembrane segments, optimizing the outputs of different predictive algorithms. For all-alpha transmembrane proteins we use both the standard Kyte-Doolittle (KD) hydropathy scale and the TMHMM predictor (http://www.cbs.dtu.dk/). On a set of 188 well-characterized membrane proteins, MaxSubSeq nearly doubles the rate of correctly located transmembrane segments compared with the standard KD hydrophobicity plot, reaching 51% accuracy. When MaxSubSeq is used to optimize the TMHMM method, the accuracy increases from 68% to 72%. When used to regularize the prediction of beta transmembrane strands obtained with a neural network-based and an HMM-based predictor, MaxSubSeq increases the per-protein accuracy up to 72% and 73%, respectively.
AVAILABILITY: The program is available from the authors upon request, or through our web server (http://gpcr.biocomp.unibo.it/predictors/).
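The abstract does not spell out the recurrence used by MaxSubSeq, so the following is only a minimal sketch of the general idea it describes: choosing non-overlapping segments of constrained length that maximize a summed per-residue score. The function name `best_segments`, the parameters `min_len`, `max_len` and `min_gap`, and the convention that positive scores favor membership in a segment are all assumptions made for illustration, not the authors' implementation.

```python
from itertools import accumulate


def best_segments(scores, min_len, max_len, min_gap=1):
    """Pick non-overlapping segments maximizing the summed score.

    Illustrative sketch only (not the published MaxSubSeq code).
    scores  : per-residue values; positive favors being inside a segment
    min_len : shortest allowed segment (assumed parameter)
    max_len : longest allowed segment (assumed parameter)
    min_gap : minimum number of residues between two segments (assumed)
    Returns (total_score, [(start, end), ...]) with end exclusive.
    """
    n = len(scores)
    # prefix[i] = sum(scores[:i]), so any segment sum is O(1)
    prefix = [0.0] + list(accumulate(scores))
    best = [0.0] * (n + 1)      # best[i]: optimum over the first i residues
    choice = [None] * (n + 1)   # start of the segment ending at i, if any

    for i in range(1, n + 1):
        best[i] = best[i - 1]   # option 1: residue i-1 is outside any segment
        for length in range(min_len, min(max_len, i) + 1):
            start = i - length
            prev = max(0, start - min_gap)  # last index usable before the gap
            total = best[prev] + prefix[i] - prefix[start]
            if total > best[i]:  # option 2: a segment ends exactly at i
                best[i] = total
                choice[i] = start

    # Trace back the chosen segments from the end of the sequence.
    segments = []
    i = n
    while i > 0:
        if choice[i] is None:
            i -= 1
        else:
            start = choice[i]
            segments.append((start, i))
            i = max(0, start - min_gap)
    segments.reverse()
    return best[n], segments


# Toy signal with two positive stretches separated by negative values.
toy = [-1, 2, 3, 2, -1, -1, 3, 3, 3, -2]
total, segs = best_segments(toy, min_len=3, max_len=4, min_gap=1)
print(total, segs)  # 16.0 [(1, 4), (6, 9)]
```

In practice the input scores would be derived from a predictor's output, for example a per-residue propensity shifted by a threshold so that residues unlikely to be in the membrane contribute negative values. This sketch runs in O(n · (max_len − min_len + 1)) time.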
Citations
Evaluation of methods for predicting the topology of β-barrel outer membrane proteins and a consensus prediction method
TL;DR: The consensus prediction method described in this work optimizes the predicted topology with a dynamic programming algorithm and is implemented in a web-based application freely available to non-commercial users at http://bioinformatics.uoa.gr/ConBBPRED.
SVMtm: support vector machines to predict transmembrane segments.
TL;DR: A new method has been developed for the prediction of transmembrane helices using support vector machines; it can distinguish transmembrane proteins from soluble proteins with an accuracy of ∼99% and can be used for consensus analysis of entire proteomes.
An ENSEMBLE machine learning approach for the prediction of all-alpha membrane proteins.
TL;DR: ENSEMBLE scores with a per-protein accuracy of 90% for topography and 71% for topology, outperforming the best single method by 7 and 5 percentage points, respectively, and is higher than that of the best predictors presently available on the Web.
Algorithms for incorporating prior topological information in HMMs: application to transmembrane proteins.
TL;DR: A simple method is presented that allows incorporation of prior topological information concerning the sequences at hand, while at the same time the HMMs retain their full probabilistic interpretation in terms of conditional probabilities, and compares well against already established algorithms presented in the literature.
Sequential and Parallel Algorithms for the Generalized Maximum Subarray Problem
Sung Eun Bae
- 01 Jan 2007
TL;DR: This thesis explores various techniques to speed up the computation, presents several new algorithms for the maximum subarray problem, and investigates a speed-up option through parallel computation.
References
A simple method for displaying the hydropathic character of a protein
Jack Kyte, Russell F. Doolittle
TL;DR: A computer program that progressively evaluates the hydrophilicity and hydrophobicity of a protein along its amino acid sequence has been devised and its simplicity and its graphic nature make it a very useful tool for the evaluation of protein structures.
Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes
TL;DR: A new membrane protein topology prediction method, TMHMM, based on a hidden Markov model is described and validated, and it is discovered that proteins with N(in)-C(in) topologies are strongly preferred in all examined organisms, except Caenorhabditis elegans, where the large number of 7TM receptors increases the counts for N(out)-C(in) topologies.
•Book
Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids
Richard Durbin, Sean R. Eddy, Anders Krogh, Graeme Mitchison
- 01 Feb 2005
TL;DR: This book gives a unified, up-to-date and self-contained account, with a Bayesian slant, of such methods, and more generally of probabilistic methods of sequence analysis.
The SWISS-PROT protein sequence data bank and its supplement TrEMBL in 1999.
Amos Marc Bairoch, Rolf Apweiler
TL;DR: The Human Proteomics Initiative (HPI), a major project to annotate all known human sequences according to the quality standards of SWISS-PROT, is described.
Evaluation of methods for the prediction of membrane spanning regions
TL;DR: This work presents an evaluation of the performance of the currently best known and most widely used methods for the prediction of transmembrane regions in proteins and shows that TMHMM is currently the best performing transmembrane prediction program.