A survey of motif discovery methods in an integrated framework.
Geir Kjetil Sandve, Finn Drabløs
TL;DR: A survey of methods for motif discovery in DNA, based on a structured and well-defined framework that integrates all relevant elements, shows that although no single method takes all relevant elements into consideration, a very large number of different models treating the various elements separately have been tried.
Abstract: There has been a growing interest in computational discovery of regulatory elements, and a multitude of motif discovery methods have been proposed. Computational motif discovery has been used with some success in simple organisms like yeast. However, as we move to higher organisms with more complex genomes, more sensitive methods are needed. Several recent methods try to integrate additional sources of information, including microarray experiments (gene expression and ChIP-chip). There is also a growing awareness that regulatory elements work in combination, and that this combinatorial behavior must be modeled for successful motif discovery. However, the multitude of methods and approaches makes it difficult to get a good understanding of the current status of the field. This paper presents a survey of methods for motif discovery in DNA, based on a structured and well-defined framework that integrates all relevant elements. Existing methods are discussed according to this framework. The survey shows that although no single method takes all relevant elements into consideration, a very large number of different models treating the various elements separately have been tried. Very often the choices that have been made are not explicitly stated, making it difficult to compare different implementations. Also, the tests that have been used are often not comparable. Therefore, a stringent framework and improved test methods are needed to evaluate the different approaches in order to conclude which ones are most promising. Reviewers: This article was reviewed by Eugene V. Koonin, Philipp Bucher (nominated by Mikhail Gelfand) and Frank Eisenhaber.
Citations
Absence of a simple code: how transcription factors read the genome
TL;DR: Structural views have been complemented with data from high-throughput in vitro and in vivo explorations of the DNA-binding preferences of many TFs to expand the understanding of TF-DNA interactions.
543
Unsupervised Pattern Discovery in Speech
Alex Park, James Glass
TL;DR: It is shown how pattern discovery can be used to automatically acquire lexical entities directly from an untranscribed audio stream by exploiting the structure of repeating patterns within the speech signal.
Proceedings Article
Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence
Gadi Aleksandrowicz, Hana Chockler, Joseph Y. Halpern, Alexander Ivrii
- 01 Jan 2014
328
Motif discovery and transcription factor binding sites before and after the next-generation sequencing era
TL;DR: ChIP, applied to transcription factors and coupled with genome tiling arrays or next-generation sequencing technologies (ChIP-Seq) has opened new avenues in research, as well as posed new challenges to bioinformaticians developing algorithms and methods for motif discovery.
Machine-Learning Techniques
Rob Sullivan
- 01 Jan 2012
TL;DR: These two broad classifications of machine-learning methods will ground us as the authors discuss a broad range of techniques and where they are currently being applied in life sciences research, expanding their toolkit and enabling us to take a very different path in their analysis efforts.
132
References
Detecting Subtle Sequence Signals: A Gibbs Sampling Strategy for Multiple Alignment
Charles E. Lawrence, Stephen F. Altschul, Mark S. Boguski, Jun Liu, Andrew F. Neuwald, John C. Wootton
TL;DR: A mathematical definition of this "local multiple alignment" problem suitable for full computer automation has been used to develop a new and sensitive algorithm, based on the statistical method of iterative sampling, that finds an optimized local alignment model for N sequences in N-linear time, requiring only seconds on current workstations.
Genome-wide location and function of DNA binding proteins
TL;DR: In this paper, a method for identifying a set of genes where cell cycle regulator binding correlates with gene expression and identifying genomic targets of cell cycle transcription activators in living cells is also encompassed.
1.9K
DNA binding sites: representation and discovery.
TL;DR: The purpose of this article is to provide a brief history of the development and application of computer algorithms for the analysis and prediction of DNA binding sites.
The Evolution of Transcriptional Regulation in Eukaryotes
Gregory A. Wray, Matthew W. Hahn, Ehab Abouheif, James P. Balhoff, Margaret Pizer, Matthew V. Rockman, Laura A. Romano
TL;DR: The evolutionary dynamics of promoter, or cis-regulatory, sequences and the evolutionary mechanisms that shape them are reviewed.
BioProspector: discovering conserved DNA motifs in upstream regulatory regions of co-expressed genes.
X. Liu, Douglas L. Brutlag, Jun Liu
- 01 Dec 2000
TL;DR: BioProspector, a C program using a Gibbs sampling strategy, examines the upstream region of genes in the same gene expression pattern group and looks for regulatory sequence motifs, showing preliminary success in finding the binding motifs for Saccharomyces cerevisiae RAP1, Bacillus subtilis RNA polymerase, and Escherichia coli CRP.