Simultaneous structural variation discovery among multiple paired-end sequenced genomes
TL;DR: This study introduces the maximum parsimony-based simultaneous structural variation discovery problem for a set of high-throughput sequenced genomes and provides efficient algorithms to solve it.
Abstract: With the increasing popularity of whole-genome shotgun sequencing (WGSS) via high-throughput sequencing technologies, it is becoming highly desirable to perform comparative studies involving multiple individuals (from a specific population, race, or a group sharing a particular phenotype). The conventional approach for a comparative genome variation study involves two key steps: (1) each paired-end high-throughput sequenced genome is compared with a reference genome and its (structural) differences are identified; (2) the lists of structural variants in each genome are compared against each other. In this study we propose to move away from this two-step approach to a novel one in which all genomes are compared with the reference genome simultaneously for obtaining much higher accuracy in structural variation detection. For this purpose, we introduce the maximum parsimony-based simultaneous structural variation discovery problem for a set of high-throughput sequenced genomes and provide efficient algorithms to solve it. We compare the proposed framework with the conventional framework, on the genomes of the Yoruban mother-father-child trio, as well as the CEU trio of European ancestry (both sequenced by Illumina platforms). We observed that the conventional framework predicts an unexpectedly high number of de novo variations in the child in comparison to the parents and misses some of the known variations. Our proposed framework, on the other hand, not only significantly reduces the number of incorrectly predicted de novo variations but also predicts more of the known (true) variations.
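The contrast between the two-step approach and simultaneous discovery can be illustrated with a toy sketch. The abstract does not spell out the paper's formulation or algorithms, so everything here is an assumption made for illustration: each candidate variant is modeled as the set of discordant read pairs it could explain, parsimony is approximated with a plain greedy set-cover heuristic, and the names `greedy_cover`, `two_step`, `simultaneous`, `de_novo`, and the trio data are hypothetical.

```python
from typing import Dict, Set, Tuple

# Hypothetical encoding: a discordant read pair is (genome name, read-pair id).
ReadPair = Tuple[str, str]


def greedy_cover(candidates: Dict[str, Set[ReadPair]],
                 reads: Set[ReadPair]) -> Dict[str, Set[str]]:
    """Greedy set cover (a stand-in heuristic, not the paper's algorithm).

    Repeatedly picks the candidate SV that explains the most still-unexplained
    read pairs. Returns chosen SV -> genomes whose reads were assigned to it.
    """
    uncovered = set(reads)
    chosen: Dict[str, Set[str]] = {}
    while uncovered and candidates:
        best = max(sorted(candidates),
                   key=lambda sv: len(candidates[sv] & uncovered))
        gain = candidates[best] & uncovered
        if not gain:
            break  # remaining reads are explained by no candidate
        chosen[best] = {genome for genome, _ in gain}
        uncovered -= gain
    return chosen


def two_step(candidates: Dict[str, Set[ReadPair]],
             reads: Set[ReadPair]) -> Dict[str, Set[str]]:
    """Conventional framework: call SVs in each genome separately."""
    calls: Dict[str, Set[str]] = {}
    for genome in sorted({g for g, _ in reads}):
        own = {r for r in reads if r[0] == genome}
        calls[genome] = set(greedy_cover(candidates, own))
    return calls


def simultaneous(candidates: Dict[str, Set[ReadPair]],
                 reads: Set[ReadPair]) -> Dict[str, Set[str]]:
    """Pooled framework: explain the reads of all genomes at once."""
    calls: Dict[str, Set[str]] = {g: set() for g, _ in reads}
    for sv, genomes in greedy_cover(candidates, reads).items():
        for genome in genomes:
            calls[genome].add(sv)
    return calls


def de_novo(calls: Dict[str, Set[str]]) -> Set[str]:
    """SVs called in the child but in neither parent."""
    return calls["child"] - calls["mother"] - calls["father"]


# Made-up trio: the child's three read pairs are ambiguous between one
# child-only candidate and two candidates shared with a parent.
CANDIDATES = {
    "sv_shared_mother": {("mother", "m1"), ("mother", "m2"),
                         ("child", "c1"), ("child", "c2")},
    "sv_shared_father": {("father", "f1"), ("father", "f2"), ("child", "c3")},
    "sv_child_only": {("child", "c1"), ("child", "c2"), ("child", "c3")},
}
READS = set().union(*CANDIDATES.values())

print(de_novo(two_step(CANDIDATES, READS)))      # {'sv_child_only'}
print(de_novo(simultaneous(CANDIDATES, READS)))  # set()
```

In this made-up trio, per-genome calling explains the child's reads with a single child-only variant, which then looks de novo. Pooling the reads lets the two variants shared with the parents explain every read pair with fewer variants overall.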
Citations
Genome structural variation discovery and genotyping
TL;DR: It is argued that the long-term goal should be routine, cost-effective and high quality de novo assembly of human genomes to comprehensively assess all classes of structural variation.
LUMPY: a probabilistic framework for structural variant discovery
TL;DR: It is shown that LUMPY yields improved sensitivity, especially when SV signal is reduced owing to either low coverage data or low intra-sample variant allele frequency; the paper also reports a set of 4,564 validated breakpoints from the NA12878 human genome.
LUMPY: A probabilistic framework for structural variant discovery (posted content)
TL;DR: This paper proposes a probabilistic structural variation (SV) discovery framework capable of integrating any number of SV detection signals, including those generated from read alignments or prior evidence.
Computational tools for copy number variation (CNV) detection using next-generation sequencing data: features and perspectives
TL;DR: Recent advances in computational methods for CNV detection using whole-genome and whole-exome sequencing data are reviewed, with a discussion of their strengths and weaknesses and suggested directions for future development.
Genomic Patterns of De Novo Mutation in Simplex Autism
Tychele N. Turner, Bradley P. Coe, Diane E. Dickel, Kendra Hoekzema, Bradley J. Nelson, Michael C. Zody, Zev N. Kronenberg, Fereydoun Hormozdiari, Archana Raja, Len A. Pennacchio, Robert B. Darnell, Evan E. Eichler +11 more
TL;DR: Patients are more likely to carry multiple coding and noncoding DNMs in different genes, which are enriched for expression in striatal neurons, suggesting a path forward for genetically characterizing more complex cases of autism.