TL;DR: Comprehensive phylogenomic analyses of sequenced plant genomes and more than 12.6 million new expressed-sequence-tag sequences from phylogenetically pivotal lineages are used to elucidate two groups of ancient gene duplications, implicating two WGDs in ancestral lineages shortly before the diversification of extant seed plants and extant angiosperms.
Abstract: Whole-genome duplication (WGD), or polyploidy, followed by gene loss and diploidization has long been recognized as an important evolutionary force in animals, fungi and other organisms, especially plants. The success of angiosperms has been attributed, in part, to innovations associated with gene or whole-genome duplications, but evidence for proposed ancient genome duplications pre-dating the divergence of monocots and eudicots remains equivocal in analyses of conserved gene order. Here we use comprehensive phylogenomic analyses of sequenced plant genomes and more than 12.6 million new expressed-sequence-tag sequences from phylogenetically pivotal lineages to elucidate two groups of ancient gene duplications: one in the common ancestor of extant seed plants and the other in the common ancestor of extant angiosperms. Gene duplication events were intensely concentrated around 319 and 192 million years ago, implicating two WGDs in ancestral lineages shortly before the diversification of extant seed plants and extant angiosperms, respectively. Significantly, these ancestral WGDs resulted in the diversification of regulatory genes important to seed and flower development, suggesting that they were involved in major innovations that ultimately contributed to the rise and eventual dominance of seed plants and angiosperms.
TL;DR: The Daphnia genome reveals a multitude of genes and shows adaptation through gene family expansions, and the coexpansion of gene families interacting within metabolic pathways suggests that the maintenance of duplicated genes is not random.
Abstract: We describe the draft genome of the microcrustacean Daphnia pulex, which is only 200 megabases and contains at least 30,907 genes. The high gene count is a consequence of an elevated rate of gene duplication resulting in tandem gene clusters. More than a third of Daphnia's genes have no detectable homologs in any other available proteome, and the most amplified gene families are specific to the Daphnia lineage. The coexpansion of gene families interacting within metabolic pathways suggests that the maintenance of duplicated genes is not random, and the analysis of gene expression under different environmental conditions reveals that numerous paralogs acquire divergent expression patterns soon after duplication. Daphnia-specific genes, including many additional loci within sequenced regions that are otherwise devoid of annotations, are the most responsive genes to ecological challenges.
TL;DR: De novo evolution out of non-coding genomic regions is emerging as an important additional mechanism for the evolution of new gene functions, which can become relevant for lineage-specific adaptations.
Abstract: Gene evolution has long been thought to be primarily driven by duplication and rearrangement mechanisms. However, every evolutionary lineage harbours orphan genes that lack homologues in other lineages and whose evolutionary origin is only poorly understood. Orphan genes might arise from duplication and rearrangement processes followed by fast divergence; however, de novo evolution out of non-coding genomic regions is emerging as an important additional mechanism. This process appears to provide raw material continuously for the evolution of new gene functions, which can become relevant for lineage-specific adaptations.
TL;DR: It is proposed that the universal bias in gene loss between the genomes of this ancient tetraploid, and perhaps all tetraploids, is the result of selection against loss of the gene responsible for the majority of total expression for a duplicate gene pair.
Abstract: Ancient tetraploidies are found throughout the eukaryotes. After duplication, one copy of each duplicate gene pair tends to be lost (fractionate). For all studied tetraploidies, the loss of duplicated genes, known as homeologs, homoeologs, ohnologs, or syntenic paralogs, is uneven between duplicate regions. In maize, a species that experienced a tetraploidy 5–12 million years ago, we show that in addition to uneven ancient gene loss, the two complete genomes contained within maize are differentiated by ongoing fractionation among diverse inbreds as well as by a pattern of overexpression of genes from the genome that has experienced less gene loss. These expression differences are consistent over a range of experiments quantifying RNA abundance in different tissues. We propose that the universal bias in gene loss between the genomes of this ancient tetraploid, and perhaps all tetraploids, is the result of selection against loss of the gene responsible for the majority of total expression for a duplicate gene pair. Although the tetraploidy of maize is ancient, biased gene loss and expression continue today and explain, at least in part, the remarkable genetic diversity found among modern maize cultivars.
TL;DR: It is shown that paralogs share most protein–protein interactions and genetic regulators, whereas xenologs share very few of them, which suggests that gene transfer and gene duplication have very different roles in shaping the evolution of biological systems.
Abstract: Gene duplication followed by neo- or sub-functionalization deeply impacts the evolution of protein families and is regarded as the main source of adaptive functional novelty in eukaryotes. While there is ample evidence of adaptive gene duplication in prokaryotes, it is not clear whether duplication outweighs the contribution of horizontal gene transfer in the expansion of protein families. We analyzed closely related prokaryote strains or species with small genomes (Helicobacter, Neisseria, Streptococcus, Sulfolobus), average-sized genomes (Bacillus, Enterobacteriaceae), and large genomes (Pseudomonas, Bradyrhizobiaceae) to untangle the effects of duplication and horizontal transfer. After removing the effects of transposable elements and phages, we show that the vast majority of expansions of protein families are due to transfer, even among large genomes. Transferred genes (xenologs) persist longer in prokaryotic lineages, possibly due to a higher/longer adaptive role. On the other hand, duplicated genes (paralogs) are expressed more, and, when persistent, they evolve more slowly. This suggests that gene transfer and gene duplication have very different roles in shaping the evolution of biological systems: transfer allows the acquisition of new functions and duplication leads to higher gene dosage. Accordingly, we show that paralogs share most protein–protein interactions and genetic regulators, whereas xenologs share very few of them. Prokaryotes invented most of life's biochemical diversity. Therefore, the study of the evolution of biological systems should explicitly account for the predominant role of horizontal gene transfer in the diversification of protein families.
TL;DR: It is demonstrated that there is little variation in unique gene content across Leishmania species, but large-scale genetic heterogeneity can result through gene amplification on disomic chromosomes and variation in chromosome number.
Abstract: Leishmania parasites cause a spectrum of clinical pathology in humans ranging from disfiguring cutaneous lesions to fatal visceral leishmaniasis. We have generated a reference genome for Leishmania mexicana and refined the reference genomes for Leishmania major, Leishmania infantum, and Leishmania braziliensis. This has allowed the identification of a remarkably low number of genes or paralog groups (2, 14, 19, and 67, respectively) unique to one species. These were found to be conserved in additional isolates of the same species. We have predicted allelic variation and find that in these isolates, L. major and L. infantum have a surprisingly low number of predicted heterozygous SNPs compared with L. braziliensis and L. mexicana. We used short read coverage to infer ploidy and gene copy numbers, identifying large copy number variations between species, with 200 tandem gene arrays in L. major and 132 in L. mexicana. Chromosome copy number also varied significantly between species, with nine supernumerary chromosomes in L. infantum, four in L. mexicana, two in L. braziliensis, and one in L. major. A significant bias against gene arrays on supernumerary chromosomes was shown to exist, indicating that duplication events occur more frequently on disomic chromosomes. Taken together, our data demonstrate that there is little variation in unique gene content across Leishmania species, but large-scale genetic heterogeneity can result through gene amplification on disomic chromosomes and variation in chromosome number. Increased gene copy number due to chromosome amplification may contribute to alterations in gene expression in response to environmental conditions in the host, providing a genetic basis for disease tropism.
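The abstract above infers ploidy and gene copy number from short-read coverage but does not give the calculation. The following is only a minimal sketch of one common way to do this. It assumes that most chromosomes are disomic, so the median chromosome depth marks the two-copy baseline. The function names and all depth values are hypothetical and are not taken from the study.

```python
from statistics import median


def chromosome_somy(chrom_depth, all_chrom_depths, baseline_ploidy=2):
    """Estimate chromosome copy number from mean read depth.

    Assumption (not from the source): the median depth across all
    chromosomes corresponds to `baseline_ploidy` copies.
    """
    baseline = median(all_chrom_depths)
    return baseline_ploidy * chrom_depth / baseline


def gene_copy_number(gene_depth, chrom_depth, somy):
    """Estimate copies of a gene per cell from its depth relative to
    the mean depth of the chromosome that carries it."""
    return somy * gene_depth / chrom_depth


# Hypothetical mean depths for five chromosomes; the last one is amplified.
depths = [40, 40, 42, 38, 80]
amplified_somy = chromosome_somy(80, depths)       # about four copies
array_copies = gene_copy_number(240, 40, somy=2)   # tandem array on a disomic chromosome
print(amplified_somy, array_copies)
```

Real pipelines normally also correct for GC content and mappability before comparing depths; that step is omitted here.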
TL;DR: This mini-review summarises the involvement of gene duplication/amplification in the insecticide/acaricide resistance of insect and mite pests and highlights recent developments in this area in relation to P450-mediated and target-site resistance.
Abstract: Pesticide resistance in arthropods has been shown to evolve by two main mechanisms, the enhanced production of metabolic enzymes, which bind to and/or detoxify the pesticide, and mutation of the target protein, which makes it less sensitive to the pesticide. One route that leads to enhanced metabolism is the duplication or amplification of the structural gene(s) encoding the detoxifying enzyme, and this has now been described for the three main families (esterases, glutathione S-transferases and cytochrome P450 monooxygenases) implicated in resistance. More recently, a direct or indirect role for gene duplication or amplification has been described for target-site resistance in several arthropod species. This mini-review summarises the involvement of gene duplication/amplification in the insecticide/acaricide resistance of insect and mite pests and highlights recent developments in this area in relation to P450-mediated and target-site resistance.
TL;DR: The results suggest that genome duplication transforms features of A. borealis in a manner that confers adaptation to a novel environment, and it is shown that hexaploids have a fivefold fitness advantage over tetraploids in dune habitats.
Abstract: Chromosome evolution in flowering plants is often punctuated by polyploidy, genome duplication events that fundamentally alter DNA content, chromosome number, and gene dosage. Polyploidy confers postzygotic reproductive isolation and is thought to drive ecological divergence and range expansion. The adaptive value of polyploidy, however, remains uncertain; ecologists have traditionally relied on observational methods that cannot distinguish effects of polyploidy per se from genic differences that accumulate after genome duplication. Here I use an experimental approach to test how polyploidy mediates ecological divergence in Achillea borealis (Asteraceae), a widespread tetraploid plant with localized hexaploid populations. In coastal California, tetraploids and hexaploids occupy mesic grassland and xeric dune habitats, respectively. Using field transplant experiments with wild-collected plants, I show that hexaploids have a fivefold fitness advantage over tetraploids in dune habitats. Parallel experiments with neohexaploids (first-generation mutants screened from a tetraploid genetic background) reveal that a 70% fitness advantage is achieved via genome duplication per se. These results suggest that genome duplication transforms features of A. borealis in a manner that confers adaptation to a novel environment.
TL;DR: The 44K and 22K microarray results suggest that 53 and 52 non-redundant genes in this family were up-regulated in response to biotic and abiotic stresses, respectively.
Abstract: We identified 163 AP2/EREBP (APETALA2/ethylene-responsive element-binding protein) genes in rice. We analyzed gene structures, phylogenies, domain duplication, genome localizations and expression profiles. Conserved amino acid residues and phylogeny construction using the AP2/ERF conserved domain sequence suggest that in rice the OsAP2/EREBP gene family can be classified broadly into four subfamilies [AP2, RAV (related to ABI3/VP1), DREB (dehydration-responsive element-binding protein) and ERF (ethylene-responsive factor)]. The chromosomal localizations of the OsAP2/EREBP genes indicated 20 segmental duplication events involving 40 genes; 58 redundant OsAP2/EREBP genes were involved in tandem duplication events. There were fewer introns after segmental duplication. We investigated expression profiles of this gene family under biotic stresses [infection with rice viruses such as rice stripe virus (RSV), rice tungro spherical virus (RTSV) and rice dwarf virus (RDV, three virus strains S, O and D84)], and various abiotic stresses. Symptoms of virus infection were more severe in RSV infection than in RTSV and RDV infection. Responses to biotic stresses are novel findings and these stresses enhance the ability to identify the best candidate genes for further functional analysis. The genes of subgroup B-5 were not induced under abiotic treatments whereas they were activated by the three RDV strains. None of the genes of subgroups A-3 were differentially expressed by any of the biotic stresses. Our 44K and 22K microarray results suggest that 53 and 52 non-redundant genes in this family were up-regulated in response to biotic and abiotic stresses, respectively. We further examined the stress responsiveness of most genes by reverse transcription-PCR. The study results should be useful in selecting candidate genes from specific subgroups for functional analysis.
TL;DR: A genetic screen based on tRNA-mediated suppression in a Schizosaccharomyces pombe La protein (Sla1p) mutant found a duplication of the tRNASerUCA-C47:6U gene, which was shown to cause the phenotype, and mtDNA from the authors' strain and yFS101 shared 14 mtSNPs relative to a ‘reference’ mtDNA, providing the first identification of these S. pombe mtDNA discrepancies.
Abstract: We used a genetic screen based on tRNA-mediated suppression (TMS) in a Schizosaccharomyces pombe La protein (Sla1p) mutant. Suppressor pre-tRNA Ser UCA-C47:6U with a debilitating substitution in its variable arm fails to produce tRNA in a sla1-rrm mutant deficient for RNA chaperone-like activity. The parent strain and spontaneous mutant were analyzed using Solexa sequencing. One synonymous single-nucleotide polymorphism (SNP), unrelated to the phenotype, was identified. Further sequence analyses found a duplication of the tRNA Ser UCA-C47:6U gene, which was shown to cause the phenotype. Ninety percent of 28 isolated mutants contain duplicated tRNA Ser UCA-C47:6U genes. The tRNA gene duplication led to a disproportionately large increase in tRNA Ser UCA-C47:6U levels in sla1-rrm but not sla1-null cells, consistent with non-specific low-affinity interactions contributing to the RNA chaperone-like activity of La, similar to other RNA chaperones. Our analysis also identified 24 SNPs between ours and S. pombe 972h- strain yFS101 that was recently sequenced using Solexa. By including mitochondrial (mt) DNA in our analysis, overall coverage increased from 52% to 96%. mtDNA from our strain and yFS101 shared 14 mtSNPs relative to a ‘reference’ mtDNA, providing the first identification of these S. pombe mtDNA discrepancies. Thus, strain-specific and spontaneous phenotypic mutations can be mapped in S. pombe by Solexa sequencing.
TL;DR: The results suggest that Ube3a gene dosage may contribute to the autism traits of individuals with maternal 15q11-13 duplication and support the idea that increased E3A ubiquitin ligase gene dosage results in reduced excitatory synaptic transmission.
Abstract: People with autism spectrum disorder are characterized by impaired social interaction, reduced communication, and increased repetitive behaviors. The disorder has a substantial genetic component, and recent studies have revealed frequent genome copy number variations (CNVs) in some individuals. A common CNV that occurs in 1 to 3% of those with autism—maternal 15q11-13 duplication (dup15) and triplication (isodicentric extranumerary chromosome, idic15)—affects several genes that have been suggested to underlie autism behavioral traits. To test this, we tripled the dosage of one of these genes, the ubiquitin protein ligase Ube3a, which is expressed solely from the maternal allele in mature neurons, and reconstituted the three core autism traits in mice: defective social interaction, impaired communication, and increased repetitive stereotypic behavior. The penetrance of these autism traits depended on Ube3a gene copy number. In animals with increased Ube3a gene dosage, glutamatergic, but not GABAergic, synaptic transmission was suppressed as a result of reduced presynaptic release probability, synaptic glutamate concentration, and postsynaptic action potential coupling. These results suggest that Ube3a gene dosage may contribute to the autism traits of individuals with maternal 15q11-13 duplication and support the idea that increased E3A ubiquitin ligase gene dosage results in reduced excitatory synaptic transmission.
TL;DR: Data obtained from the investigation contribute to a better understanding of the complexity of the maize Hsf gene family and provide the first step towards directing future experimentation designed to perform systematic analysis of the functions of the Hsf gene family.
Abstract: Heat shock response in eukaryotes is transcriptionally regulated by conserved heat shock transcription factors (Hsfs). Hsf genes are represented by a large multigene family in plants, and investigation of the Hsf gene family will serve to elucidate the mechanisms by which plants respond to stress. In recent years, genome-wide structural and evolutionary analyses of the entire Hsf gene family have been reported for two model plant systems, Arabidopsis and rice. Maize, an important cereal crop, has long served as a model plant for genetics and evolutionary research. Although some Hsf genes have been characterized in maize, analysis of the entire Hsf gene family was not completed following the Maize (B73) Genome Sequencing Project. A genome-wide analysis was carried out in the present study to identify all maize Hsf genes. Using the complete maize genome sequence, 25 nonredundant Hsf genes, named ZmHsfs, were identified. Chromosomal location, protein domain and motif organization of the ZmHsfs were analyzed in the maize genome. The phylogenetic relationships, gene duplications and expression profiles of ZmHsf genes are also presented in this study. The 25 ZmHsfs were classified into three major classes (A, B, and C) according to their structural characteristics and phylogenetic comparisons, and class A was further subdivided into 10 subclasses. Moreover, phylogenetic analysis indicated that the orthologs from the three species (maize, Arabidopsis and rice) were distributed in all three classes; it also revealed diverse Hsf gene family expression patterns in classes and subclasses. Investigation of gene duplication events showed that chromosomal/segmental duplications played a key role in Hsf gene family expansion in maize. Furthermore, transcripts of the 25 ZmHsf genes were detected by quantitative real-time PCR in leaves subjected to heat shock. The results demonstrated that ZmHsf genes exhibit different expression levels under heat stress treatment. Overall, data obtained from our investigation contribute to a better understanding of the complexity of the maize Hsf gene family and provide the first step towards directing future experimentation designed to perform systematic analysis of the functions of the Hsf gene family.
TL;DR: A new aspect of the rRNA gene repeat (called rDNA) is introduced as a center of maintenance of genome integrity and its contribution to evolution is discussed.
Abstract: The genes encoding ribosomal RNA (rRNA) are the most abundant genes in the eukaryotic genome. They reside in tandem repetitive clusters, in some cases totaling hundreds of copies. Due to their repetitive structure and highly active transcription, the rRNA gene repeats are some of the most fragile sites in the chromosome. A unique gene amplification system compensates for loss of copies, thus maintaining copy number, albeit with some fluctuations. The unusual nature of rRNA gene repeats affects cellular functions such as senescence. In addition, we recently found that the repeat number determines sensitivity to DNA damage. In this review, I would like to introduce a new aspect of the rRNA gene repeat (called rDNA) as a center of maintenance of genome integrity and discuss its contribution to evolution.
TL;DR: The data suggest that an extra copy of the encompassed TBK1 gene is likely responsible for these cases of glaucoma, and animal studies will be necessary to rule out a role for the other duplicated or neighboring genes.
Abstract: We report identification of a novel genetic locus (GLC1P) for normal tension glaucoma (NTG) on chromosome 12q14 using linkage studies of an African-American pedigree (maximum non-parametric linkage score = 19.7, max LOD score = 2.7). Subsequent comparative genomic hybridization and quantitative polymerase chain reaction (PCR) experiments identified a 780 kbp duplication within the GLC1P locus that is co-inherited with NTG in the pedigree. Real-time PCR studies showed that the genes within this duplication [TBK1 (TANK-binding kinase 1), XPOT, RASSF3 and GNS] are all expressed in the human retina. Cohorts of 478 glaucoma patients (including 152 NTG patients), 100 normal control subjects and 400 age-related macular degeneration patients were subsequently tested for copy number variation in GLC1P. Overlapping duplications were detected in 2 (1.3%) of the 152 NTG subjects, one of which had a strong family history of glaucoma. These duplications defined a 300 kbp critical region of GLC1P that spans two genes (TBK1 and XPOT). Microarray expression experiments and northern blot analysis using RNA obtained from human skin fibroblast cells showed that duplication of chromosome 12q14 results in increased TBK1 and GNS transcription. Finally, immunohistochemistry studies showed that TBK1 is expressed in the ganglion cells, nerve fiber layer and microvasculature of the human retina. Together, these data link the duplication of genes on chromosome 12q14 with familial NTG and suggest that an extra copy of the encompassed TBK1 gene is likely responsible for these cases of glaucoma. However, animal studies will be necessary to rule out a role for the other duplicated or neighboring genes.
TL;DR: This paper describes a combinatorial model where so-called DTL-scenarios are used to explain the differences between a gene tree and a corresponding species tree taking into account gene duplications, gene losses, and lateral gene transfers (also known as horizontal gene transfers).
Abstract: The incongruency between a gene tree and a corresponding species tree can be attributed to evolutionary events such as gene duplication and gene loss. This paper describes a combinatorial model where so-called DTL-scenarios are used to explain the differences between a gene tree and a corresponding species tree taking into account gene duplications, gene losses, and lateral gene transfers (also known as horizontal gene transfers). The reasonable biological constraint that a lateral gene transfer may only occur between contemporary species leads to the notion of acyclic DTL-scenarios. Parsimony methods are introduced by defining appropriate optimization problems. We show that finding most parsimonious acyclic DTL-scenarios is NP-hard. However, by dropping the condition of acyclicity, the problem becomes tractable, and we provide a dynamic programming algorithm as well as a fixed-parameter tractable algorithm for finding most parsimonious DTL-scenarios.
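The abstract above explains gene tree and species tree incongruence through reconciliation. The following sketch shows the idea in its simplest form only. It is not the paper's DTL-scenario algorithm: it implements the classical duplication-only LCA mapping and ignores losses and lateral transfers. The nested-tuple tree encoding, the `Species_copy` leaf naming, and all function names are assumptions made for this illustration.

```python
def clades(tree):
    """Return one frozenset of leaf names per node of a nested-tuple tree."""
    if isinstance(tree, str):
        return {frozenset([tree])}
    found = set()
    leaves = frozenset()
    for child in tree:
        sub = clades(child)
        found |= sub
        leaves |= max(sub, key=len)  # the child's own clade is its largest set
    found.add(leaves)
    return found


def lca(species_clades, species_set):
    """Smallest species-tree clade containing every species in species_set."""
    return min((c for c in species_clades if species_set <= c), key=len)


def count_duplications(gene_tree, species_clades, species_of):
    """Return (species clade the node maps to, duplications in the subtree).

    A binary gene-tree node is called a duplication when it maps to the
    same species-tree node as at least one of its children.
    """
    if isinstance(gene_tree, str):
        return frozenset([species_of(gene_tree)]), 0
    left, right = gene_tree
    left_map, left_dups = count_duplications(left, species_clades, species_of)
    right_map, right_dups = count_duplications(right, species_clades, species_of)
    here = lca(species_clades, left_map | right_map)
    is_duplication = here in (left_map, right_map)
    return here, left_dups + right_dups + int(is_duplication)


# Hypothetical example: a duplication in the ancestor of species A and B.
species_tree = (("A", "B"), "C")
gene_tree = ((("A_1", "B_1"), ("A_2", "B_2")), "C_1")
_, duplications = count_duplications(
    gene_tree, clades(species_tree), lambda leaf: leaf.split("_")[0]
)
print(duplications)
```

A DTL reconciliation additionally lets a gene-tree edge jump between contemporaneous species-tree branches. The acyclicity constraint on those jumps is what the abstract reports as making the parsimony problem NP-hard.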
TL;DR: In this article, the authors demonstrated that Tn4401 is an active transposon capable of mobilizing bla(KPC) genes at high frequency (4.4 × 10^-6 per recipient cell for bla(KPC-2)).
Abstract: The carbapenemase gene bla(KPC), which is rapidly spreading worldwide, is located on a Tn3-based transposon, Tn4401. In a transposition-conjugation assay, Tn4401 was able to mobilize the bla(KPC-2) gene at a frequency of 4.4 × 10^-6 per recipient cell. A 5-bp target site duplication was observed upon each insertion, without target site specificity. This study demonstrated that Tn4401 is an active transposon capable of mobilizing bla(KPC) genes at high frequency.
TL;DR: Genes resulting from WGD and tandem duplications diverge slowest in both coding sequences and gene expression, and contribute most to genetic redundancy, while other duplication modes contribute more to evolutionary novelty.
Abstract: Background
Both single gene and whole genome duplications (WGD) have recurred in angiosperm evolution. However, the evolutionary effects of different modes of gene duplication, especially regarding their contributions to genetic novelty or redundancy, have been inadequately explored.
Results
In Arabidopsis thaliana and Oryza sativa (rice), species that deeply sample botanical diversity and for which expression data are available from a wide range of tissues and physiological conditions, we have compared expression divergence between genes duplicated by six different mechanisms (WGD, tandem, proximal, DNA based transposed, retrotransposed and dispersed), and between positional orthologs. Both neo-functionalization and genetic redundancy appear to contribute to retention of duplicate genes. Genes resulting from WGD and tandem duplications diverge slowest in both coding sequences and gene expression, and contribute most to genetic redundancy, while other duplication modes contribute more to evolutionary novelty. WGD duplicates may more frequently be retained due to dosage amplification, while inferred transposon mediated gene duplications tend to reduce gene expression levels. The extent of expression divergence between duplicates is discernibly related to duplication modes, different WGD events, amino acid divergence, and putatively neutral divergence (time), but the contribution of each factor is heterogeneous among duplication modes. Gene loss may retard inter-species expression divergence. Members of different gene families may have non-random patterns of origin that are similar in Arabidopsis and rice, suggesting the action of pan-taxon principles of molecular evolution.
Conclusion
Gene duplication modes differ in contribution to genetic novelty and redundancy, but show some parallels in taxa separated by hundreds of millions of years of evolution.
TL;DR: This analysis identified several novel dysregulated genes and miRNAs in ASD compared with controls, including HEY1, SOX9, miR-486 and miR-181b, and found significant enrichment in molecules associated with neurological disorders such as Rett syndrome and those associated with nervous system development and function including long-term potentiation.
TL;DR: Combined phenotypic and expression analysis indicated that, whereas 5AQ plays a major role in conferring domestication-related traits, 5Dq contributes directly and 5Bq indirectly to suppression of the speltoid phenotype, all contributing to the domestication traits.
Abstract: The Q gene encodes an AP2-like transcription factor that played an important role in domestication of polyploid wheat. The chromosome 5A Q alleles (5AQ and 5Aq) have been well studied, but much less is known about the q alleles on wheat homoeologous chromosomes 5B (5Bq) and 5D (5Dq). We investigated the organization, evolution, and function of the Q/q homoeoalleles in hexaploid wheat (Triticum aestivum L.). Q/q gene sequences are highly conserved within and among the A, B, and D genomes of hexaploid wheat, the A and B genomes of tetraploid wheat, and the A, S, and D genomes of the diploid progenitors, but the intergenic regions of the Q/q locus are highly divergent among homoeologous genomes. Duplication of the q gene 5.8 Mya was likely followed by selective loss of one of the copies from the A genome progenitor and the other copy from the B, D, and S genomes. A recent V(329)-to-I mutation in the A lineage is correlated with the Q phenotype. The 5Bq homoeoallele became a pseudogene after allotetraploidization. Expression analysis indicated that the homoeoalleles are coregulated in a complex manner. Combined phenotypic and expression analysis indicated that, whereas 5AQ plays a major role in conferring domestication-related traits, 5Dq contributes directly and 5Bq indirectly to suppression of the speltoid phenotype. The evolution of the Q/q loci in polyploid wheat resulted in the hyperfunctionalization of 5AQ, pseudogenization of 5Bq, and subfunctionalization of 5Dq, all contributing to the domestication traits.
TL;DR: The identification of HA as a major risk factor for this canine disease raises the potential of this glycosaminoglycan as a risk factor for human periodic fevers and as an important driver of chronic inflammation.
Abstract: Hereditary periodic fever syndromes are characterized by recurrent episodes of fever and inflammation with no known pathogenic or autoimmune cause. In humans, several genes have been implicated in this group of diseases, but the majority of cases remain unexplained. A similar periodic fever syndrome is relatively frequent in the Chinese Shar-Pei breed of dogs. In the western world, Shar-Pei have been strongly selected for a distinctive thick and heavily folded skin. In this study, a mutation affecting both these traits was identified. Using genome-wide SNP analysis of Shar-Pei and other breeds, the strongest signal of a breed-specific selective sweep was located on chromosome 13. The same region also harbored the strongest genome-wide association (GWA) signal for susceptibility to the periodic fever syndrome (p_raw = 2.3 × 10^-6, p_genome = 0.01). Dense targeted resequencing revealed two partially overlapping duplications, 14.3 Kb and 16.1 Kb in size, unique to Shar-Pei and upstream of the Hyaluronic Acid Synthase 2 (HAS2) gene. HAS2 encodes the rate-limiting enzyme synthesizing hyaluronan (HA), a major component of the skin. HA is up-regulated and accumulates in the thickened skin of Shar-Pei. A high copy number of the 16.1 Kb duplication was associated with an increased expression of HAS2 as well as the periodic fever syndrome (p < 0.0001). When fragmented, HA can act as a trigger of the innate immune system and stimulate sterile fever and inflammation. The strong selection for the skin phenotype therefore appears to enrich for a pleiotropic mutation predisposing these dogs to a periodic fever syndrome. The identification of HA as a major risk factor for this canine disease raises the potential of this glycosaminoglycan as a risk factor for human periodic fevers and as an important driver of chronic inflammation.
TL;DR: This rapid and efficient Agrobacterium-mediated VIGS assay provides a very powerful tool for rapid large-scale analysis of gene functions at genome-wide level in cotton.
Abstract: Cotton (Gossypium hirsutum) is one of the most important crops worldwide. Considerable efforts have been made on molecular breeding of new varieties. Large-scale gene functional analysis in cotton has lagged behind that in most modern plant species, likely due to its large genome size, gene duplication and polyploidy, long growth cycle and recalcitrance to genetic transformation(1). To facilitate high-throughput functional genetic/genomic study in cotton, we attempt to develop rapid and efficient transient assays to assess cotton gene functions. Virus-Induced Gene Silencing (VIGS) is a powerful technique that was developed based on the host Post-Transcriptional Gene Silencing (PTGS) to repress viral proliferation(2,3). Agrobacterium-mediated VIGS has been successfully applied in a wide range of dicot species, such as Solanaceae, Arabidopsis and legume species, and monocot species, including barley, wheat and maize, for various functional genomic studies(3,4). As this rapid and efficient approach avoids plant transformation and overcomes functional redundancy, it is particularly attractive and suitable for functional genomic study in crop species like cotton that are not amenable to transformation. In this study, we report the detailed protocol of the Agrobacterium-mediated VIGS system in cotton. Among the several viral VIGS vectors, the tobacco rattle virus (TRV) invades a wide range of hosts and is able to spread vigorously throughout the entire plant yet produce mild symptoms on the hosts(5). To monitor the silencing efficiency, GrCLA1, a homolog of the Arabidopsis Cloroplastos alterados 1 gene (AtCLA1) in cotton, has been cloned and inserted into the VIGS binary vector pYL156. The CLA1 gene is involved in chloroplast development(6), and previous studies have shown that loss-of-function of AtCLA1 resulted in an albino phenotype on true leaves(7), providing an excellent visual marker for silencing efficiency. At approximately two weeks post Agrobacterium infiltration, the albino phenotype started to appear on the true leaves, with 100% silencing efficiency in all replicated experiments. The silencing of endogenous gene expression was also confirmed by RT-PCR analysis. Significantly, silencing could potently occur in all the cultivars we tested, including various commercially grown varieties in Texas. This rapid and efficient Agrobacterium-mediated VIGS assay provides a very powerful tool for rapid large-scale analysis of gene functions at genome-wide level in cotton.
TL;DR: All 33 OsATG homologues could be detected in at least one cell type of the various tissues under normal or stress growth conditions, but their expression was tightly regulated and 10 duplicated genes showed expression divergence.
Abstract: Autophagy is an intracellular degradation process for recycling macromolecules and organelles. It plays important roles in plant development and in response to nutritional demand, stress, and senescence. Organisms from yeast to plants contain many autophagy-associated genes (ATG). In this study, we found that a total of 33 ATG homologues exist in the rice [Oryza sativa L. (Os)] genome, which were classified into 13 ATG subfamilies. Six of them are alternatively spliced genes. Evolutional analysis showed that expansion of 10 OsATG homologues occurred via segmental duplication events and that the occurrence of these OsATG homologues within each subfamily was asynchronous. The Ka/Ks ratios suggested purifying selection for four duplicated OsATG homologues and positive selection for two. Calculating the dates of the duplication events indicated that all duplication events might have occurred after the origin of the grasses, from 21.43 to 66.77 million years ago. Semi-quantitative RT‐PCR analysis and mining the digital expression database of rice showed that all 33 OsATG homologues could be detected in at least one cell type of the various tissues under normal or stress growth conditions, but their expression was tightly regulated. The 10 duplicated genes showed expression divergence. The expression of most OsATG homologues was regulated by at least one treatment, including hormones, abiotic and biotic stresses, and nutrient limitation. The identification of OsATG homologues showing constitutive expression or responses to environmental stimuli provides new insights for in-depth characterization of selected genes of importance in rice.
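The Ka/Ks interpretation and the dating of duplication events mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the conventional reading that Ka/Ks < 1 suggests purifying selection and Ka/Ks > 1 suggests positive selection, and the widely used approximation T = Ks / (2λ). The rate λ = 6.5e-9 synonymous substitutions per site per year is an assumption (a value often used for grasses) that the abstract does not state, and the function names are hypothetical.

```python
# Assumed synonymous substitution rate (per site per year). This value is
# often used for grasses but is NOT given in the abstract above.
ASSUMED_RATE = 6.5e-9


def selection_mode(ka: float, ks: float) -> str:
    """Classify selection from the Ka/Ks ratio (conventional interpretation)."""
    if ks <= 0:
        raise ValueError("Ks must be positive")
    ratio = ka / ks
    if ratio < 1:
        return "purifying"
    if ratio > 1:
        return "positive"
    return "neutral"


def duplication_age_mya(ks: float, rate: float = ASSUMED_RATE) -> float:
    """Approximate duplication age in million years: T = Ks / (2 * rate)."""
    return ks / (2 * rate) / 1e6


if __name__ == "__main__":
    # Illustrative Ka and Ks values only, not data from the study.
    print(selection_mode(0.05, 0.28))
    print(round(duplication_age_mya(0.28), 2))
```

Under this assumed rate, the reported range of 21.43 to 66.77 million years would correspond to Ks values of roughly 0.28 to 0.87; a different rate would shift those values proportionally.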
TL;DR: Functional studies using zebrafish indicate a conserved role of this ligand-receptor system in the regulation of cell survival and resistance to infectious disease.
Abstract: The tumor necrosis factor superfamily (TNFSF) and the TNF receptor superfamily (TNFRSF) have an ancient evolutionary origin that can be traced back to single copy genes within Arthropods. In humans, 18 TNFSF and 29 TNFRSF genes have been identified. Evolutionary models account for the increase in gene number primarily through multiple whole genome duplication events as well as by lineage and/or species-specific tandem duplication and translocation. The identification and functional analyses of teleost ligands and receptors provide insight into the critical transition between invertebrates and higher vertebrates. Bioinformatic analyses of fish genomes and EST datasets identify 14 distinct ligand groups, some of which are novel to teleosts, while to date, only limited numbers of receptors have been characterized in fish. The most studied ligand is TNF of which teleost species possess between 1 and 3 copies as well as a receptor similar to TNFR1. Functional studies using zebrafish indicate a conserved role of this ligand-receptor system in the regulation of cell survival and resistance to infectious disease. The increasing interest and use of TNFSF and TNFRSF modulators in human and animal medicine underscores the need to understand the evolutionary origins as well as conserved and novel functions of these biologically important molecules.
TL;DR: Overall, NOS family evolution was the result of multiple gene and genome duplication events together with changes in protein architecture, resulting in three isoforms--I, II, and III--in current mammals.
Abstract: Nitric oxide (NO) is essential to many physiological functions and operates in several signaling pathways. It is not understood how and when the different isoforms of nitric oxide synthase (NOS), the enzyme responsible for NO production, evolved in metazoans. This study investigates the number and structure of metazoan NOS enzymes by genome data mining and direct cloning of Nos genes from the lamprey. In total, 181 NOS proteins are analyzed from 33 invertebrate and 63 vertebrate species. Comparisons among protein and gene structures, combined with phylogenetic and syntenic studies, provide novel insights into how NOS isoforms arose and diverged. Protein domains and gene organization—that is, intron positions and phases—of animal NOS are remarkably conserved across all lineages, even in fast-evolving species. Phylogenetic and syntenic analyses support the view that a proto-NOS isoform was recurrently duplicated in different lineages, acquiring new structural configurations through gains and losses of protein motifs. We propose that in vertebrates a first duplication took place after the agnathan–gnathostome split followed by a paralog loss. A second duplication occurred during early tetrapod evolution, giving rise to the three isoforms—I, II, and III—in current mammals. Overall, NOS family evolution was the result of multiple gene and genome duplication events together with changes in protein architecture.
TL;DR: The evolution of the IGF binding protein (IGFBP) gene family has been difficult to resolve, and both chromosomal and serial duplications have been suggested as mechanisms for the expansion of this gene family.
Abstract: The evolution of the IGF binding protein (IGFBP) gene family has been difficult to resolve. Both chromosomal and serial duplications have been suggested as mechanisms for the expansion of this gene family. We have identified and annotated IGFBP sequences from a wide selection of vertebrate species as well as Branchiostoma floridae and Ciona intestinalis. By combining detailed sequence analysis with sequence-based phylogenies and chromosome information, we arrive at the following scenario: the ancestral chordate IGFBP gene underwent a local gene duplication, resulting in a gene pair adjacent to a HOX cluster. Subsequently, the gene family expanded in the two basal vertebrate tetraploidizations (2R), resulting in the six IGFBP types that are presently found in placental mammals. The teleost fish ancestor underwent a third tetraploidization (3R) that further expanded the IGFBP repertoire. The five sequenced teleost fish genomes retain 9-11 IGFBP genes. This scenario is supported by the phylogenies of three adjacent gene families in the HOX gene regions, namely the epidermal growth factor receptors (EGFR) and the Ikaros and distal-less (DLX) transcription factors. Our sequence comparisons show that several important structural components in the IGFBPs are ancestral vertebrate features that have been maintained in all orthologs, for instance the integrin interaction motif Arg-Gly-Asp in IGFBP-2. In contrast, the Arg-Gly-Asp motif in IGFBP-1 has arisen independently in mammals. The large degree of retention of IGFBP genes after the ancient expansion of the gene family strongly suggests that each gene evolved distinct and important functions early in vertebrate evolution.
TL;DR: The results support the findings of recent pangenomic studies that true polysomy 17 is uncommon and they have important potential implications for guiding HER2-targeted therapy in breast cancer.
Abstract: Purpose: The ratio of human epidermal growth factor receptor 2 (HER2) to CEP17 by fluorescent in situ hybridization (FISH) with the centromeric probe CEP17 is used to determine HER2 gene status in breast cancer. Increases in CEP17 copy number have been interpreted as representing polysomy 17. However, pangenomic studies have demonstrated that polysomy 17 is rare. This study tests the hypothesis that the use of alternative chromosome 17 reference genes might more accurately assess true HER2 gene status. Patients and Methods: In all, 171 patients with breast cancer who had HER2 FISH that had increased mean CEP17 copy numbers (> 2.6) were selected for additional chromosome 17 studies that used probes for Smith-Magenis syndrome (SMS), retinoic acid receptor alpha (RARA), and tumor protein p53 (TP53) genes. A eusomic copy number exhibited in one or more of these loci was used to calculate a revised HER2-to-chromosome-17 ratio by using the eusomic gene locus as the reference. Results: Of 132 cases classified as no...
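The revised-ratio idea described in the methods above can be sketched as follows. This is only an illustration under stated assumptions: the function name is hypothetical, the cutoff used to call an alternative locus eusomic (here reusing the 2.6 value quoted for increased CEP17) is an assumption, and no amplification threshold is applied because the abstract does not state one.

```python
def her2_ratios(her2, cep17, alt_loci, eusomic_max=2.6):
    """Return HER2-to-reference ratios keyed by the reference locus.

    her2, cep17  : mean copy numbers per nucleus
    alt_loci     : mapping of alternative chromosome 17 loci (e.g. SMS,
                   RARA, TP53) to their mean copy numbers
    eusomic_max  : ASSUMED upper bound for calling a locus eusomic; the
                   abstract does not define this cutoff
    """
    # The conventional ratio against the centromeric probe is always reported.
    ratios = {"CEP17": her2 / cep17}
    # Add a revised ratio for every alternative locus that looks eusomic.
    for locus, copies in alt_loci.items():
        if 0 < copies <= eusomic_max:
            ratios[locus] = her2 / copies
    return ratios


if __name__ == "__main__":
    # Illustrative copy numbers only, not data from the study.
    example = her2_ratios(6.0, 4.0, {"SMS": 2.0, "RARA": 4.5, "TP53": 2.5})
    for locus, ratio in example.items():
        print(locus, round(ratio, 2))
```

Returning one ratio per eusomic locus avoids guessing how the study chose among several eusomic loci, which the truncated abstract does not say.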
TL;DR: It is shown that yellow-like sequences are present in bacteria, insects, and fungi but absent from other eukaryotes apart from isolated putative sequences in Amphioxus, the Salmon Louse, and Naegleria, and that a highly conserved block of three to five genes has been maintained throughout insect diversification despite extensive genome rearrangements.
Abstract: The yellow gene family is intriguing for a number of reasons. To date, yellow-like genes have only been identified in insect species and a number of bacteria. The function of the yellows is largely unknown, although a few have been associated with melanization and behavior in Drosophila, and a unique clade of genes from Apis mellifera may be involved in caste specification. Here, we show that yellow-like sequences are present in bacteria, insects, and fungi but absent from other eukaryotes apart from isolated putative sequences in Amphioxus, the Salmon Louse, and Naegleria. The yellow-like family forms a discrete gene class characterized by the presence of a major royal jelly protein domain, but eukaryote yellow-like proteins are not monophyletic. The unusual phylogenetic distribution of yellow-like sequences suggests either multiple horizontal transfer from bacteria into eukaryotes or extensive gene loss in eukaryote lineages. Comparative analysis of yellow family synteny and gene order demonstrates that a highly conserved block of three to five genes has been maintained throughout insect diversification despite extensive genome rearrangements. We show strong purifying selection on seven yellow genes over approximately 100 My separating the silkmoth and Heliconius butterflies and an association between spatial regulation of gene expression and distribution of melanic pigment in the developing butterfly wing. A single ancestral yellow-like gene has therefore undergone multiple rounds of duplication within the insects accompanied by functional constraint on both genomic location and protein evolution.
TL;DR: Analysis of genomic DNA of children with autism and healthy controls for rare CNVs suggests that for some genes affected by CNVs in autism, reduced transcript expression may be a mechanism of pathogenesis during neurodevelopment.
Abstract: Individuals with autism are more likely to carry rare inherited and de novo copy number variants (CNVs). However, further research is needed to establish which CNVs are causal and the mechanisms by which these CNVs influence autism. We examined genomic DNA of children with autism (N=41) and healthy controls (N=367) for rare CNVs using a high-resolution array comparative genomic hybridization platform. We show that individuals with autism are more likely to harbor rare CNVs as small as ∼10 kb, a threshold not previously detectable, and that CNVs in cases disproportionately affect genes involved in transcription, nervous system development, and receptor activity. We also show that a subset of genes that have known or suspected allele-specific or imprinting effects and are within rare-case CNVs may undergo loss of transcript expression. In particular, expression of CNTNAP2 and ZNF214 is decreased in probands compared with their unaffected transmitting parents. Furthermore, expression of PRODH and ARID1B, two genes affected by de novo CNVs, is decreased in probands compared with controls. These results suggest that for some genes affected by CNVs in autism, reduced transcript expression may be a mechanism of pathogenesis during neurodevelopment.
TL;DR: The data expand the spectrum of the clinical findings in patients with these genomic abnormalities and provide further support for the pathogenic involvement of this duplication in patients who carry them.
Abstract: The chromosome 16p13.11 heterozygous deletion is associated with a diverse array of neuropsychiatric disorders including intellectual disabilities, autism, schizophrenia, epilepsy and attention-deficit hyperactivity disorder. However, the clinical significance of its reciprocal duplication is not yet clearly defined. We evaluated 1645 consecutive pediatric patients with various developmental disorders by high-resolution microarray-based comparative genomic hybridization and identified four deletions and eight duplications within the 16p13.11 region, representing ∼0.73% (12/1645) of the patients analyzed. Recurrent clinical features in these patients include mental retardation/intellectual disability, autism, seizures, dysmorphic features or multiple congenital anomalies. Our data expand the spectrum of the clinical findings in patients with these genomic abnormalities and provide further support for the pathogenic involvement of this duplication in patients who carry it.
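The reported frequency can be reproduced with a short calculation. The counts below come from the abstract above; the helper name is hypothetical, and the per-type breakdown is simple arithmetic on those counts, not a figure reported by the authors.

```python
def percent(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    return 100.0 * count / total


# Counts taken from the abstract above.
DELETIONS = 4
DUPLICATIONS = 8
PATIENTS = 1645

if __name__ == "__main__":
    combined = percent(DELETIONS + DUPLICATIONS, PATIENTS)
    print(f"16p13.11 CNVs overall: {combined:.2f}%")
    print(f"deletions only: {percent(DELETIONS, PATIENTS):.2f}%")
    print(f"duplications only: {percent(DUPLICATIONS, PATIENTS):.2f}%")
```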
TL;DR: It is argued that an expression independent of an external stimulus, such as diet induced activity, emerged as a novel function in vertebrate ancestry allocated to the SCD5 isoform in various tissues (e.g. brain and pancreas), and it was selectively maintained throughout vertebrate evolution.
Abstract: Stearoyl-CoA desaturases (SCDs) are key enzymes involved in de novo monounsaturated fatty acid synthesis. They catalyze the desaturation of saturated fatty acyl-CoA substrates at the delta-9 position, generating essential components of phospholipids, triglycerides, cholesterol esters and wax esters. Despite being crucial for interpreting the roles of SCDs across species, the evolutionary history of the SCD gene family in vertebrates has yet to be elucidated, in particular their isoform diversity, origin and function. This work aims to contribute to this fundamental effort. We show here, through comparative genomics and phylogenetics, that the SCD gene family underwent an unexpectedly complex history of duplication and loss events. Paralogy analysis hints that SCD1 and SCD5 genes emerged as part of the whole genome duplications (2R) that occurred at the stem of the vertebrate lineage. The SCD1 gene family expanded in rodents with the parallel loss of SCD5 in the Muridae family. The SCD1 gene expansion is also observed in the Lagomorpha, although without the SCD5 loss. In the amphibian Xenopus tropicalis we find a single SCD1 gene but not SCD5, though this could be due to genome incompleteness. In the analysed teleost species no SCD5 is found, while the surrounding SCD5-less locus is conserved in comparison to tetrapods. In addition, the teleost SCD1 gene repertoire expanded to two copies as a result of the teleost-specific genome duplication (3R). Finally, we describe clear orthologues of SCD1 and SCD5 in the chondrichthyan Scyliorhinus canicula, a representative of the oldest extant jawed vertebrate clade. Expression analysis in S. canicula shows that whilst SCD1 is ubiquitous, SCD5 is mainly expressed in the brain, a pattern which might indicate an evolutionarily conserved function. We conclude that the SCD1 and SCD5 genes emerged as part of the 2R genome duplications. We propose that the evolutionarily conserved gene expression between distinct lineages underpins the importance of SCD activity in the brain (and probably the pancreas), in a yet to be defined role. We argue that an expression independent of an external stimulus, such as diet-induced activity, emerged as a novel function in vertebrate ancestry allocated to the SCD5 isoform in various tissues (e.g. brain and pancreas), and it was selectively maintained throughout vertebrate evolution.