TL;DR: A large-scale project to characterize copy-number alterations in primary lung adenocarcinomas using dense single nucleotide polymorphism arrays identifies NKX2-1 (NK2 homeobox 1, also called TITF1), which lies in the minimal 14q13.3 amplification interval and encodes a lineage-specific transcription factor, as a novel candidate proto-oncogene involved in a significant fraction of lung carcinomas.
Abstract: Somatic alterations in cellular DNA underlie almost all human cancers 1. The prospect of targeted therapies 2 and the development of high-resolution, genome-wide approaches 3–8 are now spurring systematic efforts to characterize cancer genomes. Here we report a large-scale project to characterize copy-number alterations in primary lung adenocarcinomas. By analysis of a large collection of tumours (n = 371) using dense single nucleotide polymorphism arrays, we identify a total of 57 significantly recurrent events. We find that 26 of 39 autosomal chromosome arms show consistent large-scale copy-number gain or loss, of which only a handful have been linked to a specific gene. We also identify 31 recurrent focal events, including 24 amplifications and 7 homozygous deletions. Only six of these focal events are currently associated with known mutations in lung carcinomas. The most common event, amplification of chromosome 14q13.3, is found in 12% of samples. On the basis of genomic and functional analyses, we identify NKX2-1 (NK2 homeobox 1, also called TITF1), which lies in the minimal 14q13.3 amplification interval and encodes a lineage-specific transcription factor, as a novel candidate proto-oncogene involved in a significant fraction of lung adenocarcinomas. More generally, our results indicate that many of the genes that are involved in lung adenocarcinoma remain to be discovered. A collection of 528 snap-frozen lung adenocarcinoma resection specimens, with at least 70% estimated tumour content, was selected by a panel of thoracic pathologists (Supplementary Table 1); samples were anonymized to protect patient privacy. Tumour and normal DNAs were hybridized to Affymetrix 250K Sty single nucleotide polymorphism (SNP) arrays. Genomic copy number for each of over 238,000 probe sets was determined by calculating the intensity ratio between the tumour DNA and the average of a set of normal DNAs 9,10.
Segmented copy numbers for each tumour were inferred with the GLAD (gain and loss analysis of DNA) algorithm 11 and normalized to a median of two copies. Each copy number profile was then subjected to quality control, resulting in 371 high-quality samples used for further analysis, of which 242 had matched normal
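As a rough illustration of the copy-number calculation described in this abstract, the Python sketch below divides each tumour probe intensity by the average intensity of a set of normal samples and then rescales the result so that the median corresponds to two copies. It is a simplified sketch, not the authors' pipeline: the function names and the intensity values are hypothetical, the GLAD segmentation step is omitted, and the normalization is applied directly to probe-level ratios rather than to segmented values.

```python
from statistics import mean, median


def raw_copy_ratio(tumour, normals):
    """Per-probe tumour intensity divided by the mean normal intensity.

    tumour  -- list of probe intensities for one tumour sample
    normals -- list of normal samples, each a list aligned with `tumour`
    """
    ratios = []
    for i, t in enumerate(tumour):
        reference = mean(sample[i] for sample in normals)
        ratios.append(t / reference)
    return ratios


def normalize_to_two_copies(ratios):
    """Rescale the ratios so that their median equals two copies."""
    m = median(ratios)
    return [2.0 * r / m for r in ratios]


# Hypothetical intensities for five probe sets (illustrative, not real data).
tumour = [100.0, 100.0, 200.0, 100.0, 50.0]
normals = [
    [90.0, 110.0, 100.0, 100.0, 100.0],
    [110.0, 90.0, 100.0, 100.0, 100.0],
]

copy_number = normalize_to_two_copies(raw_copy_ratio(tumour, normals))
print(copy_number)  # [2.0, 2.0, 4.0, 2.0, 1.0]
```

In this toy input the third probe appears amplified (four copies) and the fifth appears to have lost one copy, while the remaining probes sit at the two-copy baseline.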
TL;DR: There is compelling evidence that changes in DNA sequence, cis- and trans-acting effects, chromatin modifications, RNA-mediated pathways, and regulatory networks modulate differential expression of homoeologous genes and phenotypic variation that may facilitate adaptive evolution in polyploid plants and domestication in crops.
Abstract: Polyploidy, or whole-genome duplication (WGD), is an important genomic feature for all eukaryotes, especially many plants and some animals. The common occurrence of polyploidy suggests an evolutionary advantage of having multiple sets of genetic material for adaptive evolution. However, increased gene and genome dosages in autopolyploids (duplications of a single genome) and allopolyploids (combinations of two or more divergent genomes) often cause genome instabilities, chromosome imbalances, regulatory incompatibilities, and reproductive failures. Therefore, new allopolyploids must establish a compatible relationship between alien cytoplasm and nuclei and between two divergent genomes, leading to rapid changes in genome structure, gene expression, and developmental traits such as fertility, inbreeding, apomixis, flowering time, and hybrid vigor. Although the underlying mechanisms for these changes are poorly understood, some themes are emerging. There is compelling evidence that changes in DNA se...
TL;DR: In the unicellular eukaryote Paramecium tetraurelia, a ciliate, most of the nearly 40,000 genes arose through at least three successive whole-genome duplications.
Abstract: The duplication of entire genomes has long been recognized as having great potential for evolutionary novelties, but the mechanisms underlying their resolution through gene loss are poorly understood. Here we show that in the unicellular eukaryote Paramecium tetraurelia, a ciliate, most of the nearly 40,000 genes arose through at least three successive whole-genome duplications. Phylogenetic analysis indicates that the most recent duplication coincides with an explosion of speciation events that gave rise to the P. aurelia complex of 15 sibling species. We observed that gene loss occurs over a long timescale, not as an initial massive event. Genes from the same metabolic pathway or protein complex have common patterns of gene loss, and highly expressed genes are over-retained after all duplications. The conclusion of this analysis is that many genes are maintained after whole-genome duplication not because of functional innovation but because of gene dosage constraints. Ciliates are unique among unicellular organisms in that they separate germline and somatic functions 1. Each cell harbours two kinds of nucleus, namely silent diploid micronuclei and highly polyploid macronuclei. The latter are unusual in that they contain an extensively rearranged genome streamlined for expression and divide by a non-mitotic process. Only micronuclei undergo meiosis to perpetuate genetic information; the macronuclei are lost at each sexual generation and develop anew from the micronuclear lineage. In Paramecium the exact number of micronuclear chromosomes (more than 50) and the structures of their centromeres and telomeres remain unknown. During macronuclear development, these chromosomes are amplified to about 800 copies and undergo two types of DNA elimination event.
Tens of thousands of short, unique copy elements (internal eliminated sequences) are removed by a precise mechanism that leads to the reconstitution of functional genes 2. Transposable elements and other repeated sequences are removed by an imprecise mechanism leading either to chromosome fragmentation and de novo telomere addition or to variable internal deletions. These rearrangements occur after a few rounds of endoreplication, leading to some heterogeneity in the sequences abutting the imprecisely eliminated regions. The sizes of the resulting acentric macronuclear chromosomes range from 50 to 1,000 kilobases (kb) as measured by pulsed-field gel electrophoresis. Because the sexual process of autogamy results in an entirely homozygous genotype, the macronuclear DNA that was sequenced was genetically homogeneous.
TL;DR: It is suggested that duplication of the ancestral bifunctional gene allowed for the resolution of an adaptive conflict between the transcriptional regulation of the two gene functions, and that one of the two resulting specialized genes became one of the most tightly regulated genes in the genome.
Abstract: How gene duplication and divergence contribute to genetic novelty and adaptation has been of intense interest, but experimental evidence has been limited. The genetic switch controlling the yeast galactose use pathway includes two paralogous genes in Saccharomyces cerevisiae that encode a co-inducer (GAL3) and a galactokinase (GAL1). These paralogues arose from a single bifunctional ancestral gene as is still present in Kluyveromyces lactis. To determine which evolutionary processes shaped the evolution of the two paralogues, here we assess the effects of precise replacement of coding and non-coding sequences on organismal fitness. We suggest that duplication of the ancestral bifunctional gene allowed for the resolution of an adaptive conflict between the transcriptional regulation of the two gene functions. After duplication, previously disfavoured binding site configurations evolved that divided the regulation of the ancestral gene into two specialized genes, one of which ultimately became one of the most tightly regulated genes in the genome.
TL;DR: This book covers the history of investigations, the role of species concepts, hypothesis testing, barriers to gene flow, hybrid fitness, gene duplication, the origin of new evolutionary lineages, implications for endangered taxa, and humans and associated lineages.
Abstract: Preface 1. History of Investigations 2. The Role of Species Concepts 3. Testing the Hypothesis 4. Barriers to Gene Flow 5. Hybrid Fitness 6. Gene Duplication 7. Origin of New Evolutionary Lineages 8. Implications For Endangered Taxa 9. Humans and Associated Lineages 10. Emergent Properties Glossary Reference Index
TL;DR: Comparative analyses are revealing the evolutionary processes that occur as multiple related genomes diverge from a shared polyploid ancestor, and in individual genomes that underwent several successive rounds of duplication.
TL;DR: This study shows that plant SKP1 genes have evolved by multiple duplication events from a single ancestral copy in the most recent common ancestor (MRCA) of eudicots and monocots, and proposes that two and three ancient retroposition events occurred in lineages leading to Arabidopsis and rice, respectively, followed by repeated tandem duplications and chromosome rearrangements.
Abstract: Gene duplication plays important roles in organismal evolution, because duplicate genes provide raw materials for the evolution of mechanisms controlling physiological and/or morphological novelties. Gene duplication can occur via several mechanisms, including segmental duplication, tandem duplication and retroposition. Although segmental and tandem duplications have been found to be important for the expansion of a number of multigene families, the contribution of retroposition is not clear. Here we show that plant SKP1 genes have evolved by multiple duplication events from a single ancestral copy in the most recent common ancestor (MRCA) of eudicots and monocots, resulting in 19 ASK (Arabidopsis SKP1-like) and 28 OSK (Oryza SKP1-like) genes. The estimated birth rates are more than ten times the average rate of gene duplication, and are even higher than that of other rapidly duplicating plant genes, such as type I MADS box genes, R genes, and genes encoding receptor-like kinases. Further analyses suggest that a relatively large proportion of the duplication events may be explained by tandem duplication, but few, if any, are likely to be due to segmental duplication. In addition, by mapping the gain/loss of a specific intron on gene phylogenies, and by searching for the features that characterize retrogenes/retrosequences, we show that retroposition is an important mechanism for expansion of the plant SKP1 gene family. Specifically, we propose that two and three ancient retroposition events occurred in lineages leading to Arabidopsis and rice, respectively, followed by repeated tandem duplications and chromosome rearrangements. Our study represents a thorough investigation showing that retroposition can play an important role in the evolution of a plant gene family whose members do not encode mobile elements.
TL;DR: A model is described by which selection continuously favors both maintenance of the duplicate copy and divergence of that copy from the parent gene, addressing the problem that maintaining a duplicate by selection for its original function would restrict its freedom to diverge.
Abstract: New genes with novel functions arise by duplication and divergence, but the process poses a problem. After duplication, an extra gene copy must rise to sufficiently high frequency in the population and remain free of common inactivating lesions long enough to acquire the rare mutations that provide a new selectable function. Maintaining a duplicated gene by selection for the original function would restrict the freedom to diverge. (We refer to this problem as Ohno's dilemma). A model is described by which selection continuously favors both maintenance of the duplicate copy and divergence of that copy from the parent gene. Before duplication, the original gene has a trace side activity (the innovation) in addition to its original function. When an altered ecological niche makes the minor innovation valuable, selection favors increases in its level (the amplification), which is most frequently conferred by increased dosage of the parent gene. Selection for the amplified minor function maintains the extra copies and raises the frequency of the amplification in the population. The same selection favors mutational improvement of any of the extra copies, which are not constrained to maintain their original function (the divergence). The rate of mutations (per genome) that improve the new function is increased by the multiplicity of target copies within a genome. Improvement of some copies relaxes selection on others and allows their loss by mutation (becoming pseudogenes). Ultimately one of the extra copies is able to provide all of the new activity.
TL;DR: The results demonstrate that the apparent stasis in total gene number among species has masked rapid turnover in individual gene gain and loss, and it is likely that this genomic revolving door has played a large role in shaping the morphological, physiological, and metabolic differences among species.
Abstract: Comparison of whole genomes has revealed large and frequent changes in the size of gene families. These changes occur because of high rates of both gene gain (via duplication) and loss (via deletion or pseudogenization), as well as the evolution of entirely new genes. Here we use the genomes of 12 fully sequenced Drosophila species to study the gain and loss of genes at unprecedented resolution. We find large numbers of both gains and losses, with over 40% of all gene families differing in size among the Drosophila. Approximately 17 genes are estimated to be duplicated and fixed in a genome every million years, a rate on par with that previously found in both yeast and mammals. We find many instances of extreme expansions or contractions in the size of gene families, including the expansion of several sex- and spermatogenesis-related families in D. melanogaster that also evolve under positive selection at the nucleotide level. Newly evolved gene families in our dataset are associated with a class of testes-expressed genes known to have evolved de novo in a number of cases. Gene family comparisons also allow us to identify a number of annotated D. melanogaster genes that are unlikely to encode functional proteins, as well as to identify dozens of previously unannotated D. melanogaster genes with conserved homologs in the other Drosophila. Taken together, our results demonstrate that the apparent stasis in total gene number among species has masked rapid turnover in individual gene gain and loss. It is likely that this genomic revolving door has played a large role in shaping the morphological, physiological, and metabolic differences among species.
TL;DR: This review explains sequence-based microbial classification, with emphasis on relating the complex world of microbial taxonomy to a clinical context, and discusses a rational approach to broad-range bacterial polymerase chain reaction and gene sequencing when applied directly to clinical samples.
Abstract: Gene amplification and sequencing have led to the discovery of new pathogens as agents of disease and have enabled us to better classify microorganisms from culture. Sequence-based identification of bacteria and fungi using culture is more objective and accurate than conventional methods, especially for classifying unusual microorganisms that are emerging pathogens in immunocompromised hosts. Although a powerful tool, the interpretation of sequence-based classification can be challenging as microbial taxonomy grows more complex, without known clinical correlatives. Additionally, broad-range gene polymerase chain reaction and sequencing have emerged as alternative, culture-independent methods for detecting pathogens from clinical material. The promise of this technique has remained strong, limited mainly by contamination and inadequate sensitivity issues. This review explains sequence-based microbial classification, with emphasis on relating the complex world of microbial taxonomy to a clinical context. Additionally, this review discusses a rational approach to broad-range bacterial polymerase chain reaction and gene sequencing when applied directly to clinical samples.
TL;DR: Molecular phylogenetic analyses suggest that both active chitinases (chitotriosidase and AMCase) result from an early gene duplication event, and substantial gene specialization has occurred in time, allowing for tissue-specific expression of pH optimized chitinases and chi-lectins.
Abstract: Family 18 of glycosyl hydrolases encompasses chitinases and so-called chi-lectins lacking enzymatic activity due to amino acid substitutions in their active site. Both types of proteins widely occur in mammals although these organisms lack endogenous chitin. Their physiological function(s) as well as evolutionary relationships are still largely enigmatic. An overview of all family members is presented and their relationships are described. Molecular phylogenetic analyses suggest that both active chitinases (chitotriosidase and AMCase) result from an early gene duplication event. Further duplication events, followed by mutations leading to loss of chitinase activity, allowed evolution of the chi-lectins. The homologous genes encoding chitinase(-like) proteins are clustered in two distinct loci that display a high degree of synteny among mammals. Despite the shared chromosomal location and high homology, individual genes have evolved independently. Orthologs are more closely related than paralogues, and calculated substitution rate ratios indicate that protein-coding sequences underwent purifying selection. Substantial gene specialization has occurred in time, allowing for tissue-specific expression of pH optimized chitinases and chi-lectins. Finally, several family 18 chitinase-like proteins are present only in certain lineages of mammals, exemplifying recent evolutionary events in the chitinase protein family.
TL;DR: The results identify duplication of MYB as an oncogenic event and suggest that MYB could be a therapeutic target in human T-ALL.
Abstract: We identified a duplication of the MYB oncogene in 8.4% of individuals with T cell acute lymphoblastic leukemia (T-ALL) and in five T-ALL cell lines. The duplication is associated with a threefold increase in MYB expression, and knockdown of MYB expression initiates T cell differentiation. Our results identify duplication of MYB as an oncogenic event and suggest that MYB could be a therapeutic target in human T-ALL.
TL;DR: The first example of a recurrent genomic disorder associated with diabetes is described: a recurrent 1.5-Mb deletion on 17q12 encompassing the TCF2 gene, first identified in a fetus with multicystic dysplastic kidneys. The reciprocal duplication appears to be enriched in samples from patients with epilepsy.
Abstract: Most studies of genomic disorders have focused on patients with cognitive disability and/or peripheral nervous system defects. In an effort to broaden the phenotypic spectrum of this disease model, we assessed 155 autopsy samples from fetuses with well-defined developmental pathologies in regions predisposed to recurrent rearrangement, by array-based comparative genomic hybridization. We found that 6% of fetal material showed evidence of microdeletion or microduplication, including three independent events that likely resulted from unequal crossing-over between segmental duplications. One of the microdeletions, identified in a fetus with multicystic dysplastic kidneys, encompasses the TCF2 gene on 17q12, previously shown to be mutated in maturity-onset diabetes, as well as in a subset of pediatric renal abnormalities. Fine-scale mapping of the breakpoints in different patient cohorts revealed a recurrent 1.5-Mb de novo deletion in individuals with phenotypes that ranged from congenital renal abnormalities to maturity-onset diabetes of the young type 5. We also identified the reciprocal duplication, which appears to be enriched in samples from patients with epilepsy. We describe the first example of a recurrent genomic disorder associated with diabetes.
TL;DR: Within many of the duplicated pairs, 1 gene is expressed at a higher level across all assayed conditions, which suggests that the subfunctionalization model for duplicate gene preservation provides, at best, only a partial explanation for the patterns of expression divergence between duplicated genes.
Abstract: New genes may arise through tandem duplication, dispersed small-scale duplication, and polyploidy, and patterns of divergence between duplicated genes may vary among these classes. We have examined patterns of gene expression and coding sequence divergence between duplicated genes in Arabidopsis thaliana. Due to the simultaneous origin of polyploidy-derived gene pairs, we can compare covariation in the rates of expression divergence and sequence divergence within this group. Among tandem and dispersed duplicates, much of the divergence in expression profile appears to occur at or shortly after duplication. Contrary to findings from other eukaryotic systems, there is little relationship between expression divergence and synonymous substitutions, whereas there is a strong positive relationship between expression divergence and nonsynonymous substitutions. Because this pattern is pronounced among the polyploidy-derived pairs, we infer that the strength of purifying selection acting on protein sequence and expression pattern is correlated. The polyploidy-derived pairs are somewhat atypical in that they have broader expression patterns and are expressed at higher levels, suggesting differences among polyploidy- and nonpolyploidy-derived duplicates in the types of genes that revert to single copy. Finally, within many of the duplicated pairs, 1 gene is expressed at a higher level across all assayed conditions, which suggests that the subfunctionalization model for duplicate gene preservation provides, at best, only a partial explanation for the patterns of expression divergence between duplicated genes.
TL;DR: It is speculated that Maverick elements represent an evolutionary missing link between seemingly disparate invasive DNA elements that include bacteriophages, adenoviruses and eukaryotic linear plasmids.
TL;DR: Data show that increases in genomic complexity can lead to phenotypic complexity (venom composition) and that positive Darwinian selection is a common evolutionary force in snake venoms; regions identified under selection on the surface of phospholipase A2 enzymes are potential candidate sites for structure-based antivenin design.
Abstract: Gene duplication followed by functional divergence has long been hypothesized to be the main source of molecular novelty. Convincing examples of neofunctionalization, however, remain rare. Snake venom phospholipase A2 genes are members of large multigene families with many diverse functions, thus they are excellent models to study the emergence of novel functions after gene duplications. Here, I show that positive Darwinian selection and neofunctionalization are common in snake venom phospholipase A2 genes. The pattern of gene duplication and positive selection indicates that adaptive molecular evolution occurs immediately after duplication events as novel functions emerge and continues as gene families diversify and are refined. Surprisingly, adaptive evolution of group-I phospholipases in elapids is also associated with speciation events, suggesting adaptation of the phospholipase arsenal to novel prey species after niche shifts. Mapping the location of sites under positive selection onto the crystal structure of phospholipase A2 showed that regions evolving under diversifying selection are located on the molecular surface and are likely protein-protein interaction sites essential for toxin functions. These data show that increases in genomic complexity (through gene duplications) can lead to phenotypic complexity (venom composition) and that positive Darwinian selection is a common evolutionary force in snake venoms. Finally, regions identified under selection on the surface of phospholipase A2 enzymes are potential candidate sites for structure-based antivenin design.
TL;DR: It is proposed that the yeast WGD was probably an autopolyploidization. The patterns of gene loss changed over time, a trend that parallels an increasing restriction of reciprocal gene loss to more slowly evolving gene pairs and suggests that, as duplicate genes diverged, one gene copy became favored over the other.
Abstract: Among yeasts that underwent whole-genome duplication (WGD), Kluyveromyces polysporus represents the lineage most distant from Saccharomyces cerevisiae. By sequencing the K. polysporus genome and comparing it with the S. cerevisiae genome using a likelihood model of gene loss, we show that these species diverged very soon after the WGD, when their common ancestor contained >9,000 genes. The two genomes subsequently converged onto similar current sizes (5,600 protein-coding genes each) and independently retained sets of duplicated genes that are strikingly similar. Almost half of their surviving single-copy genes are not orthologs but paralogs formed by WGD, as would be expected if most gene pairs were resolved independently. In addition, by comparing the pattern of gene loss among K. polysporus, S. cerevisiae, and three other yeasts that diverged after the WGD, we show that the patterns of gene loss changed over time. Initially, both members of a duplicate pair were equally likely to be lost, but loss of the same gene copy in independent lineages was increasingly favored at later time points. This trend parallels an increasing restriction of reciprocal gene loss to more slowly evolving gene pairs over time and suggests that, as duplicate genes diverged, one gene copy became favored over the other. The apparent low initial sequence divergence of the gene pairs leads us to propose that the yeast WGD was probably an autopolyploidization.
TL;DR: The resistance (R) gene Pi37, present in the rice cultivar St. No. 1, was isolated by an in silico map-based cloning procedure and complementation analysis revealed Pi37-3 to be the functional gene, while -1, -2, and -4 are probably pseudogenes.
Abstract: The resistance (R) gene Pi37, present in the rice cultivar St. No. 1, was isolated by an in silico map-based cloning procedure. The equivalent genetic region in Nipponbare contains four nucleotide binding site–leucine-rich repeat (NBS–LRR) type loci. These four candidates for Pi37 (Pi37-1, -2, -3, and -4) were amplified separately from St. No. 1 via long-range PCR, and cloned into a binary vector. Each construct was individually transformed into the highly blast susceptible cultivar Q1063. The subsequent complementation analysis revealed Pi37-3 to be the functional gene, while -1, -2, and -4 are probably pseudogenes. Pi37 encodes a 1290 peptide NBS–LRR product, and the presence of substitutions at two sites in the NBS region (V239A and I247M) is associated with the resistance phenotype. Semiquantitative expression analysis showed that in St. No. 1, Pi37 was constitutively expressed and only slightly induced by blast infection. Transient expression experiments indicated that the Pi37 product is restricted to the cytoplasm. Pi37-3 is thought to have evolved recently from -2, which in turn was derived from an ancestral -1 sequence. Pi37-4 is likely the most recently evolved member of the cluster and probably represents a duplication of -3. The four Pi37 paralogs are more closely related to maize rp1 than to any of the currently isolated rice blast R genes Pita, Pib, Pi9, Pi2, Piz-t, and Pi36.
TL;DR: In rainbow trout, transactivation and transrepression activities of the two GRs show marked differences in their sensitivity to glucocorticoids, suggesting a mechanism that may allow the two GRs to control different physiological pathways.
TL;DR: The results support an important role of the FSGD and other types of duplication in the evolution of pigmentation in fish, and suggest that teleost fishes have a greater repertoire of pigment synthesis genes than any other vertebrate group.
Abstract: Coloration and color patterning belong to the most diverse phenotypic traits in animals. Particularly, teleost fishes possess more pigment cell types than any other group of vertebrates. As the result of an ancient fish-specific genome duplication (FSGD), teleost genomes might contain more copies of genes involved in pigment cell development than tetrapods. No systematic genomic inventory allowing to test this hypothesis has been drawn up so far for pigmentation genes in fish, and almost nothing is known about the evolution of these genes in different fish lineages. Using a comparative genomic approach including phylogenetic reconstructions and synteny analyses, we have studied two major pigment synthesis pathways in teleost fish, the melanin and the pteridine pathways, with respect to different types of gene duplication. Genes encoding three of the four enzymes involved in the synthesis of melanin from tyrosine have been retained as duplicates after the FSGD. In the pteridine pathway, two cases of duplicated genes originating from the FSGD as well as several lineage-specific gene duplications were observed. In both pathways, genes encoding the rate-limiting enzymes, tyrosinase and GTP-cyclohydrolase I (GchI), have additional paralogs in teleosts compared to tetrapods, which have been generated by different modes of duplication. We have also observed a previously unrecognized diversity of gchI genes in vertebrates. In addition, we have found evidence for divergent resolution of duplicated pigmentation genes, i.e., differential gene loss in divergent teleost lineages, particularly in the tyrosinase gene family. Mainly due to the FSGD, teleost fishes apparently have a greater repertoire of pigment synthesis genes than any other vertebrate group. Our results support an important role of the FSGD and other types of duplication in the evolution of pigmentation in fish.
TL;DR: It is found that the difference in dispensability observed between the two duplicate types is limited to gene products found within protein complexes, and probably results from differences in the relative strength of the evolutionary pressures present following each type of duplication event.
Abstract: Genes in populations are in constant flux, being gained through duplication and occasionally retained or, more frequently, lost from the genome. In this study we compare pairs of identifiable gene duplicates generated by small-scale (predominantly single-gene) duplications with those created by a large-scale gene duplication event (whole-genome duplication) in the yeast Saccharomyces cerevisiae. We find a number of quantifiable differences between these data sets. Whole-genome duplicates tend to exhibit less profound phenotypic effects when deleted, are functionally less divergent, and are associated with a different set of functions than their small-scale duplicate counterparts. At first sight, either of these latter two features could provide a plausible mechanism by which the difference in dispensability might arise. However, we uncover no evidence suggesting that this is the case. We find that the difference in dispensability observed between the two duplicate types is limited to gene products found within protein complexes, and probably results from differences in the relative strength of the evolutionary pressures present following each type of duplication event. Genes, and the proteins they specify, originating from small-scale and whole-genome duplication events differ in quantifiable ways. We infer that this is not due to their association with different functional categories; rather, it is a direct result of biases in gene retention.
TL;DR: Investigation of evolutionary links between the insect Halloween genes and vertebrate steroidogenic P450s suggests that they originated from common ancestors, perhaps destined for steroidogenesis, before the deuterostome-arthropod split.
TL;DR: Based on a large collection of EST sequences, evidence is provided that the haploid moss Physcomitrella patens is a paleopolyploid as well and metabolic genes seem to have been retained in excess following the genome duplication in P. patens.
Abstract: Analyses of complete genomes and large collections of gene transcripts have shown that most, if not all, seed plants have undergone one or more genome duplications in their evolutionary past. In this study, based on a large collection of EST sequences, we provide evidence that the haploid moss Physcomitrella patens is a paleopolyploid as well. Based on the construction of linearized phylogenetic trees, we infer the genome duplication to have occurred between 30 and 60 million years ago. Gene Ontology and pathway association of the duplicated genes in P. patens reveal different biases of gene retention compared with seed plants. Metabolic genes seem to have been retained in excess following the genome duplication in P. patens. This might, at least partly, explain the versatility of metabolism, as described for P. patens and other mosses, in comparison to other land plants.
TL;DR: This review summarizes current progress in understanding the evolution of plant miRNA gene families, whose members are often highly similar, suggesting recent expansion via tandem gene duplication and segmental duplication events.
Abstract: MicroRNAs (miRNAs) are important post-transcriptional regulators of their target genes in plants and animals. miRNAs are usually 20-24 nucleotides long. Despite their unusually small sizes, the evolutionary history of miRNA gene families seems to be similar to that of their protein-coding counterparts. In contrast to the small but abundant miRNA families in animal genomes, plants have fewer but larger miRNA gene families. Members of plant miRNA gene families are often highly similar, suggesting recent expansion via tandem gene duplication and segmental duplication events. Although many miRNA genes are conserved across plant species, the same gene family varies significantly in size and genomic organization in different species, which may cause dosage effects and spatial and temporal differences in target gene regulation. In this review, we summarize the current progress in understanding the evolution of plant miRNA gene families.
TL;DR: The gene dosage and expression profiles generated here have enabled the identification of focal amplicons characteristic for the CC genome and facilitated the validation of relevant genes in these amplicons.
Abstract: Cervical cancer (CC) cells exhibit complex karyotypic alterations, which is consistent with deregulation of numerous critical genes in its formation and progression. To characterize this karyotypic complexity at the molecular level, we used cDNA array comparative genomic hybridization (aCGH) to analyze 29 CC cases and identified a number of overrepresented and deleted genes. The aCGH analysis revealed at least 17 recurrent amplicons and six common regions of deletions. These regions contain several known tumor-associated genes, such as those involved in transcription, apoptosis, cytoskeletal remodeling, ion-transport, drug metabolism, and immune response. Using the fluorescence in situ hybridization (FISH) approach, we demonstrated the presence of high-level amplifications at the 8q24.3, 11q22.2, and 20q13 regions in CC cell lines. To identify amplification-associated genes that correspond to focal amplicons, we examined one or more genes in each of the 17 amplicons by Affymetrix U133A expression arrays and semiquantitative reverse-transcription PCR (RT-PCR) in 31 CC tumors. This analysis exhibited frequent and robust upregulated expression in CC relative to normal cervix for genes EPHB2 (1p36), CDCA8 (1p34.3), AIM2 (1q22-23), RFC4, MUC4, and HRASLS (3q27-29), SKP2 (5p12-13), CENTD3 (5q31.3), PTK2, RECQL4 (8q24), MMP1 and MMP13 (11q22.2), AKT1 (14q32.3), ABCC3 (17q21-22), SMARCA4 (19p13.3), LIG1 (19q13.3), UBE2C (20q13.1), SMC1L1 (Xp11), KIF4A (Xq12), TMSNB (Xq22), and CSAG2 (Xq28). Thus, the gene dosage and expression profiles generated here have enabled the identification of focal amplicons characteristic for the CC genome and facilitated the validation of relevant genes in these amplicons. These data thus form an important step toward the identification of biologically relevant genes in CC pathogenesis. This article contains Supplementary Material available at http://www.interscience.wiley.com/jpages/1045-2257/suppmat.
TL;DR: Molecular evidence is provided for the occurrence of several (at least 3) independent duplications of the ace-1 locus in the mosquito Culex pipiens, selected in response to insecticide pressure that probably occurred very recently (<40 years ago).
Abstract: Gene duplication is thought to be the main potential source of material for the evolution of new gene functions. Several models have been proposed for the evolution of new functions through duplication, most based on ancient events (Myr). We provide molecular evidence for the occurrence of several (at least 3) independent duplications of the ace-1 locus in the mosquito Culex pipiens, selected in response to insecticide pressure that probably occurred very recently (<40 years ago). This locus encodes the main target of several insecticides, the acetylcholinesterase. The duplications described consist of 2 alleles of ace-1, 1 susceptible and 1 resistant to insecticide, located on the same chromosome. These events were detected in different parts of the world and probably resulted from distinct mechanisms. We propose that duplications were selected because they reduce the fitness cost associated with the resistant ace-1 allele through the generation of persistent, advantageous heterozygosis. The rate of duplication of ace-1 in C. pipiens is probably underestimated, but seems to be rather high.
TL;DR: The V2R genes are expressed in the mammalian vomeronasal organ, where their products are involved in detecting pheromones; the human, chimpanzee, macaque, cow and dog V2R gene families are found to have completely degenerated.
TL;DR: Examination of TSC DNA samples for large deletion/duplication mutations using multiplex ligation-dependent probe amplification (MLPA) probe sets concludes that large deletions in TSC1 and TSC2 account for about 0.5 and 6% of mutations seen in TSC patients, respectively, and that MLPA is a highly sensitive and accurate detection method, including for mosaicism.
Abstract: Tuberous sclerosis (TSC) is an autosomal dominant disorder caused by mutations in either of two genes, TSC1 and TSC2. Point mutations and small indels account for most TSC1 and TSC2 mutations. We examined 261 TSC DNA samples (209 small-mutation-negative and 52 unscreened) for large deletion/duplication mutations using multiplex ligation-dependent probe amplification (MLPA) probe sets designed to permit interrogation of all TSC1/2 exons, as well as 15–50 kb of flanking sequence. Large deletion/duplication mutations in TSC1 and TSC2 were identified in 54 patients, of which 50 were in TSC2, and 4 were in TSC1. All but two mutations were deletions. Only 13 deletions were intragenic in TSC2, and one in TSC1, so that 39 (73%) deletions extended beyond the 5′, 3′ or both ends of TSC1 or TSC2. Mutations were identified in 24% of small-mutation-negative and 8% of unscreened samples. Eight of 54 (15%) mutations were mosaic, affecting 34–62% of cells. All intragenic mutations were confirmed by LR-PCR. Genotype/phenotype analysis showed that all (21 of 21) patients with TSC2 deletions extending 3′ into the PKD1 gene had kidney cysts. Breakpoints of intragenic deletions were randomly distributed along the TSC2 sequence, and did not preferentially involve repeat sequence elements. Our own 20-plex probe sets gave more robust performance than the 40-plex probe sets from MRC-Holland. We conclude that large deletions in TSC1 and TSC2 account for about 0.5 and 6% of mutations seen in TSC patients, respectively, and MLPA is a highly sensitive and accurate detection method, including for mosaicism.
TL;DR: The evolution of the teleost fish CC chemokine gene family is reviewed, noting evidence of widespread tandem gene duplications and examining the implications of this phenomenon on immune diversity.
Abstract: Chemokines are a superfamily of cytokines responsible for regulating cell migration under both inflammatory and physiological conditions. CC chemokines are the largest subfamily of chemokines, with 28 members in humans. A subject of intense study in mammalian species, the known functional roles of CC chemokine ligands in both developmental and disease conditions continue to expand. They are also an important family for the study of gene copy number variation and tandem duplication in mammalian species. However, little is known regarding the evolutionary origin and status of these ligands in primitive vertebrates such as teleost fish. In this paper, we review the evolution of the teleost fish CC chemokine gene family, noting evidence of widespread tandem gene duplications and examining the implications of this phenomenon on immune diversity. Through extensive phylogenetic analysis of the CC chemokine sets of four teleost species, zebrafish, catfish, rainbow trout, and Atlantic salmon, we identified seven large groups of CC chemokines. Several major groups of CC chemokines appeared to be highly related, including the CCL19/21/25 group, the CCL20 group, the CCL27/28 group, and the fish-specific group. In the three remaining groups that contained the largest number of members, the CCL17/22 group, the MIP group, and the MCP group, similarities among species members were obscured by rapid, tandem duplications that may contribute to immune diversity.
TL;DR: Accurate identification of gene duplication events is essential to avoid false‐positive ultrarapid metabolism assignments and thus, overestimation of predicted activity and increased risk for unwanted adverse events.
Abstract: Duplications and multiplications of active CYP2D6 genes can cause ultrarapid drug metabolism and lead to therapeutic failure. Multiple functional and non-functional duplication alleles have been further characterized. Duplications were detected by long-range polymerase chain reaction (PCR), PCR-restriction fragment length polymorphism, and sequence analysis. A PCR fragment encompassing the entire duplicated gene was utilized for detailed characterization. Duplications occurred at 1.3, 5.75, and 2.0% in Caucasian, African American, and racially mixed populations, respectively (n=887 total). Of those, 28, 47, and 17% were non-functional CYP2D6*4 x N. Twelve unique duplication alleles were detected: *1 x N, *2 x N, *4 x N, *6 x N, *10 x N, *17 x N, *17 x N[spacer], *29 x N, *35 x N, *43 x N, *45 x N, and a novel non-functional tandem arrangement of a chimeric 2D7/2D6 and *1 gene. All novel duplications except *35 x N were found in African Americans. Accurate identification of gene duplication events is essential to avoid false-positive ultrarapid metabolism assignments and thus, overestimation of predicted activity and increased risk for unwanted adverse events.