TL;DR: Evidence is presented that single-base repeats (the shortest possible motifs) are represented by longer runs in mammalian introns than would be expected on a random basis, supporting the idea that slipped-strand mispairing (SSM) may be a ubiquitous force in the evolution of the eukaryotic genome.
Abstract: Simple repetitive DNA sequences are a widespread and abundant feature of genomic DNA. Several features characterize such sequences: (1) they typically consist of a variety of repeated motifs of 1-10 bases, but may include much larger repeats as well; (2) larger repeat units often include shorter ones within them; (3) long polypyrimidine and poly-CA tracts are often found; and (4) tandem arrangements of closely related motifs are often found. We propose that slipped-strand mispairing (SSM) events, in concert with unequal crossing-over, can readily account for all of these features. The frequent occurrence of long tandem repeats of particular motifs (polypyrimidine and poly-CA tracts) appears to result from nonrandom patterns of nucleotide substitution. We argue that the intrahelical process of SSM is much more likely to be the major factor in the initial expansion of short repeated motifs and that, after initial expansion, simple tandem repeats may be predisposed to further expansion by unequal crossing-over or other interhelical events because of their propensity to mispair. Evidence is presented that single-base repeats (the shortest possible motifs) are represented by longer runs in mammalian introns than would be expected on a random basis, supporting the idea that SSM may be a ubiquitous force in the evolution of the eukaryotic genome. Simple repetitive sequences may therefore represent a natural ground state of DNA unselected for coding functions.
TL;DR: A review of the available data related to SSR distribution in coding and non-coding regions of genomes and SSR functional importance is presented in this article, where the role of two putative mutational mechanisms, replication slippage and recombination, and their interaction in SSR variation is discussed.
Abstract: Microsatellites, or tandem simple sequence repeats (SSRs), are abundant across genomes and show high levels of polymorphism. The genetic and evolutionary mechanisms of SSRs remain controversial. Here we attempt to summarize the available data related to SSR distribution in coding and noncoding regions of genomes and SSR functional importance. Numerous lines of evidence demonstrate that SSR genomic distribution is nonrandom. Random expansions or contractions appear to be selected against at some SSR loci at least, presumably because of their effect on chromatin organization, regulation of gene activity, recombination, DNA replication, cell cycle, mismatch repair system, etc. This review also discusses the role of two putative mutational mechanisms, replication slippage and recombination, and their interaction in SSR variation.
TL;DR: SSRs within genes evolve through mutational processes similar to those for SSRs located in other genomic regions including replication slippage, point mutation, and recombination and may provide a molecular basis for fast adaptation to environmental changes in both prokaryotes and eukaryotes.
Abstract: In recent years, a growing number of microsatellites, or simple sequence repeats (SSRs), have been found and characterized within protein-coding genes and their untranslated regions (UTRs). These data provide useful information to study possible SSR functions. Here, we review SSR distributions within expressed sequence tags (ESTs) and genes, including protein-coding regions, 3'-UTRs, 5'-UTRs, and introns, and discuss the consequences of SSR repeat-number changes in those regions of both prokaryotes and eukaryotes. Strong evidence shows that SSRs are nonrandomly distributed across protein-coding regions, UTRs, and introns. Substantial data indicate that SSR expansions and/or contractions in protein-coding regions can lead to a gain or loss of gene function via frameshift mutation or expanded toxic mRNA. SSR variations in 5'-UTRs could regulate gene expression by affecting transcription and translation. SSR expansions in the 3'-UTRs cause transcription slippage and produce expanded mRNA, which can accumulate as nuclear foci and can disrupt splicing and, possibly, other cellular functions. Intronic SSRs can affect gene transcription, mRNA splicing, or export to cytoplasm. Triplet SSRs located in the UTRs or introns can also induce heterochromatin-mediated-like gene silencing. All these effects caused by SSR expansions or contractions within genes can eventually lead to phenotypic changes. SSRs within genes evolve through mutational processes similar to those for SSRs located in other genomic regions, including replication slippage, point mutation, and recombination. These mutational processes generate DNA changes that should be corrected by the DNA mismatch repair (MMR) system. Mutations that escape MMR correction become new alleles at the SSR loci, which then regulate and/or change gene products and eventually lead to phenotypic changes. Therefore, SSRs within genes should be subjected to stronger selective pressure than other genomic regions because of their functional importance. These SSRs may provide a molecular basis for fast adaptation to environmental changes in both prokaryotes and eukaryotes.
TL;DR: Recent developments in plant genetics using SSR markers are discussed, against the background of a large body of literature on the applicability of SSR-based techniques.
Abstract: In recent years, molecular markers have been utilized for a variety of applications including examination of genetic relationships between individuals, mapping of useful genes, construction of linkage maps, marker-assisted selection and backcrossing, population genetics, and phylogenetic studies. Among the available molecular markers, microsatellites or simple sequence repeats (SSRs), which are tandem repeats of DNA motifs one to six nucleotides long, have gained considerable importance in plant genetics and breeding owing to many desirable genetic attributes including hypervariability, multiallelic nature, codominant inheritance, reproducibility, relative abundance, extensive genome coverage including organellar genomes, chromosome-specific location, and amenability to automation and high-throughput genotyping. The high degree of allelic variation revealed by microsatellite markers results from variation in the number of repeat motifs at a locus caused by replication slippage and/or unequal crossing-over during meiosis. In spite of limited understanding of the functions of the SSR motifs within plant genes, SSRs are being widely utilized in plant genome analysis. Microsatellites can be developed directly from genomic DNA libraries or from libraries enriched for specific microsatellites. Alternatively, microsatellites can be found by searching public databases such as GenBank and EMBL or through cross-species transferability. At present, EST databases are an important source of candidate genes, as these can generate markers directly associated with a trait of interest and may be transferable to closely related genera. A large number of SSR-based techniques have been developed, and a large body of literature has accumulated regarding the applicability of SSRs in plant genetics and genomics. In this review we discuss the recent developments (last 4–5 years) made in plant genetics using SSR markers.
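To make the working definition above concrete (tandem repeats of motifs one to six nucleotides long), here is a minimal Python sketch of scanning a sequence for SSRs with a regular expression. It is an illustration only, not a method taken from the review: the function name `find_ssrs` and the `min_repeats` threshold are assumptions, and dedicated SSR-mining tools apply more careful rules (for example, separate length thresholds per motif size).

```python
import re

def find_ssrs(seq, min_motif=1, max_motif=6, min_repeats=3):
    """Return (start, motif, copies) for each perfect tandem repeat found.

    Hypothetical helper for illustration. The lazy quantifier makes the
    regex try the shortest motif first, so "AAAAAA" is reported as
    six copies of "A" rather than three copies of "AA".
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        copies = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, copies))
    return hits

if __name__ == "__main__":
    # One dinucleotide repeat, (AC)4, starting at 0-based position 3.
    print(find_ssrs("GGTACACACACTT"))
```

Only perfect repeats are detected here; interrupted or compound repeats would need a more tolerant matcher.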
TL;DR: The findings suggest that the differences between coding and noncoding microsatellite frequencies arise from specific selection against frameshift mutations in coding regions resulting from length changes in nontriplet repeats.
Abstract: Microsatellite enrichment is an excess of repetitive sequences characteristic of all studied eukaryotes. It is thought to result from the accumulated effects of replication slippage mutations. Enrichment is commonly measured as the ratio of the observed frequency of microsatellites to the frequency expected to result from random association of nucleotides. We have compared enrichment of specific types of microsatellites in coding sequences with those in noncoding sequences across seven eukaryotic clades. The results reveal consistent differences between coding and noncoding regions, in terms of both the quantity of repetitive DNA and the types present. In noncoding regions, all types of microsatellite (mono-, di-, tri-, tetra-, penta-, and hexanucleotide repeats) are found in excess, and in all cases, these excesses scale in a similar exponential fashion with the length of the microsatellite. This suggests that all types of noncoding repeats are subject to similar mutational and selective processes. Coding repeats, however, appear to be under much stronger and more specific constraints. Tri- and hexanucleotide repeats are found in consistent and significant excess over a wide range of lengths in both coding and noncoding sequences, but other repeat types are much less frequent in coding regions than in noncoding regions. These findings suggest that the differences between coding and noncoding microsatellite frequencies arise from specific selection against frameshift mutations in coding regions resulting from length changes in nontriplet repeats. Furthermore, the excesses of tri- and hexanucleotide coding repeats appear to be controlled primarily by mutation pressure.
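The enrichment measure described above (observed frequency divided by the frequency expected from random association of nucleotides) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' procedure: the function name `enrichment` is hypothetical, occurrences are counted over overlapping windows, and the expectation treats bases as independent draws at their observed frequencies in the same sequence.

```python
import math
from collections import Counter

def enrichment(seq, motif, repeats):
    """Observed/expected ratio for `motif` repeated `repeats` times.

    Hypothetical helper for illustration. Expected counts assume each
    base is drawn independently at its observed frequency in `seq`.
    """
    seq = seq.upper()
    target = motif.upper() * repeats
    n_windows = len(seq) - len(target) + 1
    if n_windows <= 0:
        raise ValueError("sequence is shorter than the repeat tract")

    # Observed: windows that exactly match the repeat tract.
    observed = sum(1 for i in range(n_windows) if seq.startswith(target, i))

    # Expected: number of windows times the product of base frequencies.
    base_counts = Counter(seq)
    prob = 1.0
    for base in target:
        prob *= base_counts[base] / len(seq)
    expected = n_windows * prob

    if expected == 0:
        return math.nan  # a base in the motif never occurs in the sequence
    return observed / expected

if __name__ == "__main__":
    # One observed (A)4 tract against 5 * 0.5**4 = 0.3125 expected.
    print(enrichment("AAAATTTT", "A", 4))
```

A ratio above 1 indicates more repeat tracts than independent base composition predicts. Real analyses would use long sequences and may count tracts of exact rather than minimum length, which changes the expectation.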