Construction of a dictionary of sequence motifs that characterize groups of related proteins.
TL;DR: The identified sequence motifs represent functionally important sites on protein molecules and are of practical use: for 20% of newly determined sequences, they identify superfamily membership without the need for a sequence database search.
Abstract: An automatic procedure is proposed to identify, from the protein sequence database, conserved amino acid patterns (or sequence motifs) that are exclusive to a group of functionally related proteins. This procedure is applied to the PIR database, and a dictionary of sequence motifs that relate to specific superfamilies is constructed. The motifs are of practical use: for 20% of newly determined sequences, they identify superfamily membership without the need for a sequence database search. The sequence motifs identified represent functionally important sites on protein molecules. When a single motif contains multiple blocks, the blocks are often close together in the 3-D structure. Furthermore, when the correlation with exon structure was examined, these motif blocks were occasionally found to be split by introns.
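The idea of a motif that is conserved within a group and exclusive to it can be illustrated with a small sketch. This is not the authors' procedure, whose details are not given in the abstract. It assumes a deliberately simplified definition in which a motif is an exact fixed-length substring (k-mer) found in every member of one group and in no sequence of any other group. The function names, the group labels, and the toy sequences are all hypothetical.

```python
from typing import Dict, List, Set


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all length-k substrings of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def exclusive_motifs(groups: Dict[str, List[str]], k: int) -> Dict[str, Set[str]]:
    """Build a toy motif dictionary: k-mers shared by every member of a
    group and absent from every sequence of all other groups.
    (Simplifying assumption: exact, ungapped, fixed-length motifs.)"""
    dictionary: Dict[str, Set[str]] = {}
    for name, members in groups.items():
        if not members:
            dictionary[name] = set()
            continue
        # Conserved: present in every member of this group.
        conserved = set.intersection(*(kmers(s, k) for s in members))
        # Background: every k-mer seen in any other group.
        background: Set[str] = set()
        for other, seqs in groups.items():
            if other != name:
                for s in seqs:
                    background |= kmers(s, k)
        # Exclusive: conserved here and never seen elsewhere.
        dictionary[name] = conserved - background
    return dictionary


def classify(seq: str, dictionary: Dict[str, Set[str]]) -> List[str]:
    """Assign a new sequence to each group whose motif it contains,
    by plain substring lookup rather than a database search."""
    return sorted(
        name for name, motifs in dictionary.items()
        if any(m in seq for m in motifs)
    )


# Hypothetical toy data, invented for illustration only.
toy_groups = {
    "group_a": ["MKGSGKTAL", "AAGSGKTPW", "GSGKTLLQ"],
    "group_b": ["MKHECHAL", "PPHECHTW"],
}
motif_dictionary = exclusive_motifs(toy_groups, k=4)
assigned = classify("QQGSGKTWW", motif_dictionary)
```

A new sequence is then assigned by looking up dictionary motifs in it, which mirrors the use case described in the abstract. Real motifs generally tolerate substitutions and gaps, which this exact-match sketch does not model.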
Citations
The PROSITE database, its status in 1997
TL;DR: The PROSITE database (http://www.expasy.ch/sprot/prosite.html) consists of biologically significant patterns and profiles formulated in such a way that with appropriate computational tools it can help to determine to which known family of protein a new sequence belongs, or which known domain(s) it contains.
The PROSITE database, its status in 1999.
TL;DR: The PROSITE database consists of biologically significant patterns and profiles formulated in such a way that with appropriate computational tools it can help to determine to which known family of protein (if any) a new sequence belongs, or which known domain(s) it contains.
Finding flexible patterns in unaligned protein sequences.
TL;DR: The method is shown to recover known motifs for PROSITE families and is also applied to some recently described families from the literature.
Automated construction and graphical presentation of protein blocks from unaligned sequences.
TL;DR: Blockmaker is an automated system that finds blocks in a group of related protein sequences submitted by the user. It adapts and extends existing algorithms to make them useful to biologists looking for conserved regions in a group of related protein sequences.
PROSITE: recent developments
Amos Marc Bairoch,Philip Bucher +1 more
TL;DR: PROSITE is a compilation of sites and patterns found in protein sequences that can be used as a method of determining the function of uncharacterized proteins translated from genomic or cDNA sequences.