MaxSubSeq: an algorithm for segment-length optimization. The case study of the transmembrane spanning segments.

doi:10.1093/BIOINFORMATICS/BTG023

Open Access•Journal Article•10.1093/BIOINFORMATICS/BTG023

Piero Fariselli, +5 more

- 01 Mar 2003

- Bioinformatics

- Vol. 19, Iss: 4, pp 500-505

34

TL;DR: This paper describes a general dynamic programming-like algorithm specifically designed to optimize the number and length of segments with constrained length in a given protein sequence, and presents the detailed description of MaxSubSeq.

Abstract:

MOTIVATION: A problem in predicting the topography of transmembrane proteins is the optimal localization of the transmembrane segments along the protein sequence, given that each residue is associated with a propensity for being included in a transmembrane region of the protein. Previous work has shown that post-processing the propensity signals with suitable algorithms can greatly improve the quality and accuracy of the predictions. In this paper we describe a general dynamic programming-like algorithm (MaxSubSeq, Maximal SubSequence) specifically designed to optimize the number and length of segments with constrained length in a given protein sequence. Previous applications of our algorithm have proved its effectiveness in optimizing the output of both neural networks and hidden Markov models, and in this paper we present the detailed description of MaxSubSeq.

RESULTS: We describe the application of MaxSubSeq to the location of both helical and beta-strand transmembrane segments, optimizing the outputs derived with different predictive algorithms. For all-alpha transmembrane proteins we use both the standard Kyte-Doolittle (KD) hydropathy scale and the TMHMM predictor (http://www.cbs.dtu.dk/). On a set of 188 well-characterized membrane proteins, MaxSubSeq nearly doubles the rate of correctly located transmembrane segments compared with the standard KD hydrophobicity plot, reaching 51% accuracy. When MaxSubSeq is used to optimize the TMHMM method, the accuracy increases from 68% to 72%. When used to regularize the prediction of beta transmembrane strands obtained with a neural network-based and an HMM-based predictor, MaxSubSeq increases the accuracy per protein to 72% and 73%, respectively.

AVAILABILITY: The program is available upon request from the authors, or it is accessible through our web server (http://gpcr.biocomp.unibo.it/predictors/).
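The abstract does not give the recurrence, so the following is only a minimal sketch of a generic length-constrained segment optimization by dynamic programming, not the authors' published MaxSubSeq procedure. It assumes per-residue scores that have already been shifted so that positive values favour membership in a segment (for example, a propensity minus a threshold). The function name `max_subseq_sketch` and the parameters `min_len`, `max_len`, and `gap` are hypothetical and chosen for illustration.

```python
from itertools import accumulate


def max_subseq_sketch(scores, min_len, max_len, gap=1):
    """Pick non-overlapping segments with lengths in [min_len, max_len],
    separated by at least `gap` residues, maximizing the summed score.

    Illustrative sketch only; not the published MaxSubSeq algorithm.
    Returns (total_score, [(start, end), ...]) with half-open intervals.
    """
    n = len(scores)
    # prefix[i] holds the sum of scores[0:i], so any segment sum is O(1).
    prefix = [0.0] + list(accumulate(scores))
    # best[i] is the best total score achievable using residues 0..i-1.
    best = [0.0] * (n + 1)
    # choice[i] is the length of the segment ending at i, or 0 if none ends there.
    choice = [0] * (n + 1)

    for i in range(1, n + 1):
        best[i] = best[i - 1]  # option 1: residue i-1 closes no segment
        for length in range(min_len, min(max_len, i) + 1):
            start = i - length
            prev = max(start - gap, 0)  # last position usable by earlier segments
            candidate = best[prev] + prefix[i] - prefix[start]
            if candidate > best[i]:
                best[i] = candidate
                choice[i] = length

    # Trace back from the end to recover the chosen segments.
    segments = []
    i = n
    while i > 0:
        if choice[i] == 0:
            i -= 1
        else:
            start = i - choice[i]
            segments.append((start, i))
            i = max(start - gap, 0)
    segments.reverse()
    return best[n], segments
```

With `n` residues the sketch runs in O(n × (max_len − min_len + 1)) time. Because the number of segments is not fixed in advance, the same recurrence chooses both how many segments to place and how long each one is, which is the kind of joint optimization the abstract describes.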

Citations

•Journal Article•10.1186/1471-2105-6-7

Evaluation of methods for predicting the topology of β-barrel outer membrane proteins and a consensus prediction method

Pantelis G. Bagos, +2 more

- 12 Jan 2005

- BMC Bioinformatics

TL;DR: The consensus prediction method described in this work optimizes the predicted topology with a dynamic programming algorithm and is implemented in a web-based application freely available to non-commercial users at http://bioinformatics.uoa.gr/ConBBPRED.

139

•Journal Article•10.1002/JCC.10411

SVMtm: support vector machines to predict transmembrane segments.

Zheng Yuan, +2 more

- 15 Apr 2004

- Journal of Computational Chemistry

TL;DR: A new method has been developed for the prediction of transmembrane helices using support vector machines; it can distinguish transmembrane proteins from soluble proteins with an accuracy of ∼99% and can be used for consensus analysis of entire proteomes.

99

•Journal Article•10.1093/BIOINFORMATICS/BTG1027

An ENSEMBLE machine learning approach for the prediction of all-alpha membrane proteins.

Pier Luigi Martelli, +2 more

- 03 Jul 2003

- Bioinformatics

TL;DR: ENSEMBLE achieves a per-protein accuracy of 90% for topography and 71% for topology, outperforming the best single method by 7 and 5 percentage points, respectively; this accuracy is higher than that of the best predictors presently available on the Web.

93

•Journal Article•10.1186/1471-2105-7-189

Algorithms for incorporating prior topological information in HMMs: application to transmembrane proteins.

Pantelis G. Bagos, +2 more

- 05 Apr 2006

- BMC Bioinformatics

TL;DR: A simple method is presented that allows incorporation of prior topological information concerning the sequences at hand, while at the same time the HMMs retain their full probabilistic interpretation in terms of conditional probabilities, and compares well against already established algorithms presented in the literature.

77

Sequential and Parallel Algorithms for the Generalized Maximum Subarray Problem

Sung Eun Bae

- 01 Jan 2007

TL;DR: This thesis explores various techniques to speed up the computation, and several new algorithms for the maximum subarray problem, and investigates a speed-up option through parallel computation.

29

References

•Journal Article•10.1016/0022-2836(82)90515-0

A simple method for displaying the hydropathic character of a protein

Jack Kyte, +1 more

- 05 May 1982

- Journal of Molecular Biology

TL;DR: A computer program that progressively evaluates the hydrophilicity and hydrophobicity of a protein along its amino acid sequence has been devised and its simplicity and its graphic nature make it a very useful tool for the evaluation of protein structures.

23.9K

•Journal Article•10.1006/JMBI.2000.4315

Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes

Anders Krogh, +3 more

- 19 Jan 2001

- Journal of Molecular Biology

TL;DR: A new membrane protein topology prediction method, TMHMM, based on a hidden Markov model is described and validated, and it is discovered that proteins with N(in)-C(in) topologies are strongly preferred in all examined organisms, except Caenorhabditis elegans, where the large number of 7TM receptors increases the counts for N(out)-C(in) topologies.

13K

•Book

Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids

Richard Durbin, +3 more

- 01 Feb 2005

TL;DR: This book gives a unified, up-to-date and self-contained account, with a Bayesian slant, of such methods and, more generally, of probabilistic methods of sequence analysis.

4.5K

•Journal Article•10.1093/NAR/27.1.49

The SWISS-PROT protein sequence data bank and its supplement TrEMBL in 1999.

Amos Marc Bairoch, +1 more

- 01 Jan 1998

- Nucleic Acids Research

TL;DR: The Human Proteomics Initiative (HPI), a major project to annotate all known human sequences according to the quality standards of SWISS-PROT, is described.

3.6K

•Journal Article•10.1093/BIOINFORMATICS/17.7.646

Evaluation of methods for the prediction of membrane spanning regions

Steffen Möller, +2 more

- 01 Jul 2001

- Bioinformatics

TL;DR: This work presents an evaluation of the performance of the currently best-known and most widely used methods for the prediction of transmembrane regions in proteins and shows that TMHMM is currently the best-performing transmembrane prediction program.

1.2K