Parallel processing in biological sequence comparison using general purpose processors
Friman Sánchez, Esther Salamí, Alex Ramirez, Mateo Valero +3 more
- 07 Nov 2005
- pp 99-108
TL;DR: This paper proposes a more efficient data parallel implementation of the Smith-Waterman algorithm, and obtains a 30% reduction in the execution time relative to the previous best data-parallel alternative.
Abstract: The comparison and alignment of DNA and protein sequences are important tasks in molecular biology and bioinformatics. One of the best-known algorithms for the string-matching operation present in these tasks is the Smith-Waterman algorithm (SW). However, it is a computation-intensive algorithm, and many researchers have developed heuristic strategies to avoid using it, especially when searching large databases. There are several efficient implementations of the SW algorithm on general purpose processors. These implementations try to extract data-level parallelism by taking advantage of single-instruction multiple-data (SIMD) extensions, which can perform several operations in parallel on a set of data. In this paper, we propose a more efficient data-parallel implementation of the SW algorithm. Our proposed implementation obtains a 30% reduction in the execution time relative to the previous best data-parallel alternative. We review different alternative implementations of the SW algorithm, compare them with our proposal, and present preliminary results for some heuristic implementations. Finally, we present a detailed study of the computational complexity of the different alignment algorithms presented and their behavior on the different aspects of the CPU microarchitecture.
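For readers unfamiliar with the recurrence the abstract refers to, the following is a minimal scalar sketch of Smith-Waterman local alignment scoring. It is not the paper's SIMD implementation: the function name `smith_waterman_score`, the match/mismatch scores, and the linear gap penalty are all assumptions chosen for illustration (practical protein searches typically use a substitution matrix and affine gaps). Each cell depends only on its upper, left, and upper-left neighbors, so cells on the same anti-diagonal are independent of one another, which is one property data-parallel implementations can exploit.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score between strings a and b.

    Illustrative scoring only: fixed match/mismatch values and a linear
    gap penalty (assumptions, not the paper's parameters).
    """
    prev = [0] * (len(b) + 1)  # row i-1 of the scoring matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)  # row i; column 0 stays 0
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                  # start a new local alignment here
                prev[j - 1] + sub,  # align a[i-1] with b[j-1]
                prev[j] + gap,      # gap in b
                curr[j - 1] + gap,  # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("XXACGTYY", "ACGT"))
```

The sketch keeps only two rows in memory because just the score is needed; recovering the alignment itself would require storing the full matrix (or traceback pointers). The work is proportional to the product of the two sequence lengths, which is why the abstract calls the algorithm computation intensive for large database searches.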
Citations
A Review of SIMD Multimedia Extensions and their Usage in Scientific and Engineering Applications
TL;DR: An overview of SIMD multimedia extensions is given, reviewing recent trends in using multimedia extensions to accelerate multimedia, scientific and engineering applications, and arguing for their further use in other significant computationally intensive applications.
54
An Architectural Characterization Study of Data Mining and Bioinformatics Workloads
Berkin Ozisikyilmaz, Ramanathan Narayanan, Joseph Zambreno, Gokhan Memik, Alok Choudhary +4 more
- 25 Oct 2006
TL;DR: MineBench, a publicly available benchmark suite containing fifteen representative data mining applications from various categories (classification, clustering, association rule mining and optimization), is presented and evaluated on an 8-way shared memory (SMP) machine.
Performance Analysis of Sequence Alignment Applications
Friman Sánchez, Esther Salamí, Alex Ramirez, Mateo Valero +3 more
- 01 Oct 2006
TL;DR: This article presents a micro-architecture performance analysis of recognized bioinformatics applications for the comparison and alignment of biological sequences, including BLAST, FASTA and some recognized parallel implementations of the Smith-Waterman algorithm that use the Altivec SIMD extension to improve performance.
Performance enhancement of smith-waterman algorithm using hybrid model: Comparing the MPI and hybrid programming paradigm on SMP clusters
Mahdi Noorian, Hamidreza Pooshfam, Zeinab Noorian, Rosni Abdullah +3 more
- 11 Oct 2009
TL;DR: In this paper, the Smith-Waterman algorithm is parallelized using three parallel programming models: pure MPI, pure OpenMP and hybrid MPI-OpenMP. The paper argues that hybrid programming, which employs both coarse-grain and fine-grain parallelization, is more efficient than pure MPI and pure OpenMP on clusters of SMP machines.
A Novel Protein Sequence Alignment-Based Patch Similarity Estimation for Two-Level Data Aggregation in WMSNs
M. Nava Barathy,Dharma Dejey +1 more
TL;DR: In this paper, a distributed two-layer cluster framework is used for transmitting aggregated data by the local cluster head (LCH) and master cluster head from various clusters to the base station on a multi-hop basis.
5
References
Basic Local Alignment Search Tool
TL;DR: A new approach to rapid sequence comparison, basic local alignment search tool (BLAST), directly approximates alignments that optimize a measure of local similarity, the maximal segment pair (MSP) score.
98.8K
Improved tools for biological sequence comparison.
TL;DR: Three computer programs for comparisons of protein and DNA sequences can be used to search sequence data bases, evaluate similarity scores, and identify periodic structures based on local sequence similarity.
13.3K
A general method applicable to the search for similarities in the amino acid sequence of two proteins
TL;DR: A computer adaptable method for finding similarities in the amino acid sequences of two proteins has been developed and it is possible to determine whether significant homology exists between the proteins to trace their possible evolutionary development.
13.2K
Identification of common molecular subsequences.
TL;DR: This letter extends the heuristic homology algorithm of Needleman & Wunsch (1970) to find a pair of segments, one from each of two long sequences, such that there is no other pair of segments with greater similarity (homology).
11.3K
Amino acid substitution matrices from protein blocks
TL;DR: This work has derived substitution matrices from about 2000 blocks of aligned sequence segments characterizing more than 500 groups of related proteins, leading to marked improvements in alignments and in searches using queries from each of the groups.
7.2K