About: Variable number tandem repeat is a research topic. Over the lifetime of the topic, 2503 publications have appeared, receiving 95053 citations in total. The topic is also known as: VNTR.
TL;DR: A new algorithm for finding tandem repeats which works without the need to specify either the pattern or pattern size is presented and its ability to detect tandem repeats that have undergone extensive mutational change is demonstrated.
Abstract: A tandem repeat in DNA is two or more contiguous, approximate copies of a pattern of nucleotides. Tandem repeats have been shown to cause human disease, may play a variety of regulatory and evolutionary roles and are important laboratory and analytic tools. Extensive knowledge about pattern size, copy number, mutational history, etc. for tandem repeats has been limited by the inability to easily detect them in genomic sequence data. In this paper, we present a new algorithm for finding tandem repeats which works without the need to specify either the pattern or pattern size. We model tandem repeats by percent identity and frequency of indels between adjacent pattern copies and use statistically based recognition criteria. We demonstrate the algorithm’s speed and its ability to detect tandem repeats that have undergone extensive mutational change by analyzing four sequences: the human frataxin gene, the human β T cell receptor locus sequence and two yeast chromosomes. These sequences range in size from 3 kb up to 700 kb. A World Wide Web server interface at c3.biomath.mssm.edu/trf.html has been established for automated use of the program.
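To make the definition above concrete, here is a minimal Python sketch of tandem repeat detection. It is not the algorithm presented in the paper: it finds only exact repeats by brute force over a small range of pattern sizes, and the function names, parameters, and thresholds are illustrative assumptions. The second helper measures percent identity between adjacent copies, counting substitutions only; the paper's model also covers indels, which this sketch ignores.

```python
def find_exact_tandem_repeats(seq, max_period=6, min_copies=2):
    """Return (start, period, copies) for maximal exact tandem repeats.

    Brute-force illustration only: every period up to max_period is tried,
    and no mismatches or indels are tolerated.
    """
    hits = []
    n = len(seq)
    for period in range(1, max_period + 1):
        i = 0
        while i + period < n:
            # Count how far the sequence matches itself shifted by `period`.
            run = 0
            while i + run + period < n and seq[i + run] == seq[i + run + period]:
                run += 1
            copies = (run + period) / period  # may be fractional
            if copies >= min_copies:
                hits.append((i, period, copies))
                i += run + 1  # continue after the end of this repeat
            else:
                i += 1
    return hits


def adjacent_copy_identity(seq, start, period, copies):
    """Fraction of matching positions between adjacent copies (no indels)."""
    if copies < 2:
        raise ValueError("need at least two copies to compare")
    span = period * (copies - 1)
    matches = sum(seq[start + k] == seq[start + k + period] for k in range(span))
    return matches / span


if __name__ == "__main__":
    print(find_exact_tandem_repeats("GGACACACTT", max_period=3, min_copies=3))
    print(adjacent_copy_identity("ACGTACGAACGT", start=0, period=4, copies=3))
```

Note that a repeat with period p is also reported at multiples of p when enough copies are present, so a real tool would need to merge such redundant hits and to score approximate copies statistically, as the abstract describes.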
TL;DR: A new method for de novo identification of repeat families via extension of consensus seeds is developed, which enables a rigorous definition of repeat boundaries, a key issue in repeat analysis.
Abstract: "Every time we compare two species that are closer to each other than either is to humans, we get nearly killed by unmasked repeats." (Webb Miller, personal communication)
Motivation: De novo repeat family identification is a challenging algorithmic problem of great practical importance. As the number of genome sequencing projects increases, there is a pressing need to identify the repeat families present in large, newly sequenced genomes. We develop a new method for de novo identification of repeat families via extension of consensus seeds; our method enables a rigorous definition of repeat boundaries, a key issue in repeat analysis.
Results: Our RepeatScout algorithm is more sensitive and is orders of magnitude faster than RECON, the dominant tool for de novo repeat family identification in newly sequenced genomes. Using RepeatScout, we estimate that ∼2% of the human genome and 4% of mouse and rat genomes consist of previously unannotated repetitive sequence.
Availability: Source code is available for download at http://www-cse.ucsd.edu/groups/bioinformatics/software.html
Contact: ppevzner@cs.ucsd.edu
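The RepeatScout abstract above describes extending consensus seeds but does not say how seeds are chosen. As a rough illustration of one plausible first step, the Python sketch below counts fixed-length words and keeps the frequent ones as candidate seeds. The function name, the word length `k`, and the `min_count` threshold are assumptions made for this example; the extension step itself is not shown.

```python
from collections import Counter


def frequent_kmers(seq, k, min_count=2):
    """Return (kmer, count) pairs seen at least min_count times, most frequent first.

    Hypothetical seed-selection step: overlapping words of length k are
    counted, and frequent ones are treated as candidate repeat seeds.
    """
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [(kmer, c) for kmer, c in counts.most_common() if c >= min_count]


if __name__ == "__main__":
    print(frequent_kmers("ACGTTTACGTGGACGT", k=4))
```

On a real genome the table of words would be far larger, and each seed would then be extended left and right against all of its occurrences to build a consensus with well-defined boundaries.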
TL;DR: A discriminatory subset of 15 loci with the highest evolutionary rates was defined that concentrated 96% of the total resolution obtained with the full 24-locus set, and its predictive value for evaluating M. tuberculosis transmission was found to be equal to that of IS6110 restriction fragment length polymorphism typing.
Abstract: Molecular typing based on 12 loci containing variable numbers of tandem repeats of mycobacterial interspersed repetitive units (MIRU-VNTRs) has been adopted in combination with spoligotyping as the basis for large-scale, high-throughput genotyping of Mycobacterium tuberculosis. However, even the combination of these two methods is still less discriminatory than IS6110 fingerprinting. Here, we define an optimized set of MIRU-VNTR loci with a significantly higher discriminatory power. The resolution and the stability/robustness of 29 loci were analyzed, using a total of 824 tubercle bacillus isolates, including representatives of the main lineages identified worldwide so far. Five loci were excluded for lack of robustness and/or stability in serial isolates or isolates from epidemiologically linked patients. The use of the 24 remaining loci increased the number of types by 40%—and by 23% in combination with spoligotyping—among isolates from cosmopolitan origins, compared to those obtained with the original set of 12 loci. Consequently, the clustering rate was decreased by fourfold—by threefold in combination with spoligotyping—under the same conditions. A discriminatory subset of 15 loci with the highest evolutionary rates was then defined that concentrated 96% of the total resolution obtained with the full 24-locus set. Its predictive value for evaluating M. tuberculosis transmission was found to be equal to that of IS6110 restriction fragment length polymorphism typing, as shown in a companion population-based study. This 15-locus system is therefore proposed as the new standard for routine epidemiological discrimination of M. tuberculosis isolates and the 24-locus system as a high-resolution tool for phylogenetic studies.
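The comparison of locus sets above rests on counting distinct types and measuring how many isolates fall into clusters. The Python sketch below shows one common way to compute such figures from repeat-count profiles. The abstract does not give its formulas, so the definitions here are assumptions: the clustering rate uses the "n minus one" convention, (clustered isolates - clusters) / total isolates, and discriminatory power uses the Hunter-Gaston index. The example profiles are invented for illustration.

```python
from collections import Counter


def type_counts(profiles, loci=None):
    """Count isolates per genotype, optionally using only the given locus indices."""
    if loci is None:
        keys = [tuple(p) for p in profiles]
    else:
        keys = [tuple(p[i] for i in loci) for p in profiles]
    return Counter(keys)


def clustering_rate(counts):
    """(clustered isolates - number of clusters) / total isolates (assumed definition)."""
    total = sum(counts.values())
    clusters = [c for c in counts.values() if c >= 2]
    return (sum(clusters) - len(clusters)) / total


def hunter_gaston_index(counts):
    """Probability that two isolates drawn without replacement have different types."""
    total = sum(counts.values())
    same = sum(c * (c - 1) for c in counts.values())
    return 1 - same / (total * (total - 1))


if __name__ == "__main__":
    # Hypothetical repeat counts at four loci for six isolates.
    profiles = [
        (2, 3, 4, 1), (2, 3, 4, 1), (2, 3, 4, 2),
        (2, 3, 5, 1), (1, 3, 4, 1), (2, 3, 4, 2),
    ]
    for label, loci in (("all four loci", None), ("first two loci", [0, 1])):
        counts = type_counts(profiles, loci)
        print(label, len(counts), clustering_rate(counts), hunter_gaston_index(counts))
```

With these invented profiles, dropping two loci reduces the number of types from four to two and doubles the clustering rate, which mirrors in miniature the trade-off between the 12-, 15- and 24-locus sets.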
TL;DR: It is shown that this technique can be used for forensic purposes; DNA of high relative molecular mass (Mr) can be isolated from 4-yr-old bloodstains and semen stains made on cotton cloth and digested to produce DNA fingerprints suitable for individual identification.
Abstract: Many highly polymorphic minisatellite loci can be detected simultaneously in the human genome by hybridization to probes consisting of tandem repeats of the 'core' sequence. The resulting DNA fingerprints produced by Southern blot hybridization are comprised of multiple hypervariable DNA fragments, show somatic and germline stability and are completely specific to an individual. We now show that this technique can be used for forensic purposes; DNA of high relative molecular mass (Mr) can be isolated from 4-yr-old bloodstains and semen stains made on cotton cloth and digested to produce DNA fingerprints suitable for individual identification. Further, sperm nuclei can be separated from vaginal cellular debris, obtained from semen-contaminated vaginal swabs, enabling positive identification of the male donor/suspect. It is envisaged that DNA fingerprinting will revolutionize forensic biology particularly with regard to the identification of rape suspects.
TL;DR: The full sequence for PEM, as deduced from cDNA sequences, is reported; length variations in the tandem repeat result in PEM being an expressed variable number tandem repeat locus.