About: Segmental duplication is a research topic. Over the lifetime, 1154 publications have been published within this topic receiving 80441 citations.
TL;DR: The current human genome sequence (Build 35) as discussed by the authors contains 2.85 billion nucleotides interrupted by only 341 gaps and is accurate to an error rate of approximately 1 event per 100,000 bases.
Abstract: The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers approximately 99% of the euchromatic genome and is accurate to an error rate of approximately 1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.
TL;DR: A first-generation CNV map of the human genome is constructed through the study of 270 individuals from four populations with ancestry in Europe, Africa or Asia, underscoring the importance of CNV in genetic diversity and evolution and the utility of this resource for genetic disease studies.
Abstract: Copy number variation (CNV) of DNA sequences is functionally significant but has yet to be fully ascertained. We have constructed a first-generation CNV map of the human genome through the study of 270 individuals from four populations with ancestry in Europe, Africa or Asia (the HapMap collection). DNA from these individuals was screened for CNV using two complementary technologies: single-nucleotide polymorphism (SNP) genotyping arrays, and clone-based comparative genomic hybridization. A total of 1,447 copy number variable regions (CNVRs), which can encompass overlapping or adjacent gains or losses, covering 360 megabases (12% of the genome) were identified in these populations. These CNVRs contained hundreds of genes, disease loci, functional elements and segmental duplications. Notably, the CNVRs encompassed more nucleotide content per genome than SNPs, underscoring the importance of CNV in genetic diversity and evolution. The data obtained delineate linkage disequilibrium patterns for many CNVs, and reveal marked variation in copy number among populations. We also demonstrate the utility of this resource for genetic disease studies.
TL;DR: A new method to align two or more genomes that have undergone rearrangements due to recombination and substantial amounts of segmental gain and loss is described, demonstrating high accuracy in situations where genomes have undergone biologically feasible amounts of genome rearrangement, segmental loss and loss.
Abstract: Background
Multiple genome alignment remains a challenging problem. Effects of recombination including rearrangement, segmental duplication, gain, and loss can create a mosaic pattern of homology even among closely related organisms.
TL;DR: This article identified 255 loci across the human genome that contain genomic imbalances among unrelated individuals, and revealed that half of these regions overlap with genes, and many coincide with segmental duplications or gaps in human genome assembly.
Abstract: We identified 255 loci across the human genome that contain genomic imbalances among unrelated individuals. Twenty-four variants are present in > 10% of the individuals that we examined. Half of these regions overlap with genes, and many coincide with segmental duplications or gaps in the human genome assembly. This previously unappreciated heterogeneity may underlie certain human phenotypic variation and susceptibility to disease and argues for a more dynamic human genome structure.
TL;DR: Combining information about genomic segmental duplications, gene family phylogenies, and gene positions provides a method to evaluate contributions of tandem duplication and segmental genome duplication in the generation and maintenance of gene families.
Abstract: Background
Most genes in Arabidopsis thaliana are members of gene families. How do the members of gene families arise, and how are gene family copy numbers maintained? Some gene families may evolve primarily through tandem duplication and high rates of birth and death in clusters, and others through infrequent polyploidy or large-scale segmental duplications and subsequent losses.