TL;DR: By deep sequencing of RNA from a variety of normal and malignant human cells, this work suggests that a non-canonical mode of RNA splicing, resulting in a circular RNA isoform, is a general feature of the gene expression program in human cells.
Abstract: Most human pre-mRNAs are spliced into linear molecules that retain the exon order defined by the genomic sequence. By deep sequencing of RNA from a variety of normal and malignant human cells, we found RNA transcripts from many human genes in which the exons were arranged in a non-canonical order. Statistical estimates and biochemical assays provided strong evidence that a substantial fraction of the spliced transcripts from hundreds of genes are circular RNAs. Our results suggest that a non-canonical mode of RNA splicing, resulting in a circular RNA isoform, is a general feature of the gene expression program in human cells.
TL;DR: DEXSeq is presented, a statistical method to test for differential exon usage in RNA-seq data that uses generalized linear models and offers reliable control of false discoveries by taking biological variation into account.
Abstract: RNA-seq is a powerful tool for the study of alternative splicing and other forms of alternative isoform expression. Understanding the regulation of these processes requires sensitive and specific detection of differential isoform abundance in comparisons between conditions, cell types, or tissues. We present DEXSeq, a statistical method to test for differential exon usage in RNA-seq data. DEXSeq uses generalized linear models and offers reliable control of false discoveries by taking biological variation into account. DEXSeq detects with high sensitivity genes, and in many cases exons, that are subject to differential exon usage. We demonstrate the versatility of DEXSeq by applying it to several data sets. The method facilitates the study of regulation and function of alternative exon usage on a genome-wide scale. An implementation of DEXSeq is available as an R/Bioconductor package.
TL;DR: The findings suggest that the evolution of alternative splicing has for the most part been very rapid and that alternative splicing patterns of most organs reflect the identity of the species more strongly than the organ type, with the highest complexity in primates.
Abstract: How species with similar repertoires of protein-coding genes differ so markedly at the phenotypic level is poorly understood. By comparing organ transcriptomes from vertebrate species spanning ~350 million years of evolution, we observed significant differences in alternative splicing complexity between vertebrate lineages, with the highest complexity in primates. Within 6 million years, the splicing profiles of physiologically equivalent organs diverged such that they are more strongly related to the identity of a species than they are to organ type. Most vertebrate species-specific splicing patterns are cis-directed. However, a subset of pronounced splicing changes are predicted to remodel protein interactions involving trans-acting regulators. These events likely further contributed to the diversification of splicing and other transcriptomic changes that underlie phenotypic differences among vertebrate species.
TL;DR: While tissue-specific gene expression programs are largely conserved, alternative splicing is well conserved in only a subset of tissues and is frequently lineage-specific.
Abstract: Most mammalian genes produce multiple distinct messenger RNAs through alternative splicing, but the extent of splicing conservation is not clear. To assess tissue-specific transcriptome variation across mammals, we sequenced complementary DNA from nine tissues from four mammals and one bird in biological triplicate, at unprecedented depth. We find that while tissue-specific gene expression programs are largely conserved, alternative splicing is well conserved in only a subset of tissues and is frequently lineage-specific. Thousands of previously unknown, lineage-specific, and conserved alternative exons were identified; widely conserved alternative exons had signatures of binding by MBNL, PTB, RBFOX, STAR, and TIA family splicing factors, implicating them as ancestral mammalian splicing regulators. Our data also indicate that alternative splicing often alters protein phosphorylatability, delimiting the scope of kinase signaling.
TL;DR: H19’s main physiological role is in limiting growth of the placenta before birth, by regulated processing of miR-675, which may also allow rapid inhibition of cell proliferation in response to cellular stress or oncogenic signals.
Abstract: The H19 large intergenic non-coding RNA (lincRNA) is one of the most highly abundant and conserved transcripts in mammalian development, being expressed in both embryonic and extra-embryonic cell lineages, yet its physiological function is unknown. Here we show that miR-675, a microRNA (miRNA) embedded in H19's first exon, is expressed exclusively in the placenta from the gestational time point when placental growth normally ceases, and placentas that lack H19 continue to grow. Overexpression of miR-675 in a range of embryonic and extra-embryonic cell lines results in their reduced proliferation; targets of the miRNA are upregulated in the H19 null placenta, including the growth-promoting insulin-like growth factor 1 receptor (Igf1r) gene. Moreover, the excision of miR-675 from H19 is dynamically regulated by the stress-response RNA-binding protein HuR. These results suggest that H19's main physiological role is in limiting growth of the placenta before birth, by regulated processing of miR-675. The controlled release of miR-675 from H19 may also allow rapid inhibition of cell proliferation in response to cellular stress or oncogenic signals.
TL;DR: RNA-seq of a normalized cDNA library from Arabidopsis seedlings and flowers identifies a large number of new splice junctions and shows that at least 61% of intron-containing genes are alternatively spliced under normal growth conditions.
Abstract: Alternative splicing (AS) is a widespread mechanism which increases transcriptome and proteome complexity and controls developmental programs and responses to the environment in higher eukaryotes. The splicing process, removal of introns and ligation of exons, is performed by a large RNA-protein complex, the spliceosome, consisting of five small nuclear RNAs (snRNAs) and about 180 proteins with different functions (Wahl et al. 2009). Assembly of the spliceosome on introns in a precursor messenger RNA (pre-mRNA) is directed by cis elements and trans-acting factors (Black 2003; Stamm et al. 2005). The cis sequences include the splice sites, branchpoint, and polypyrimidine tract which have degenerate consensus sequences in higher eukaryotes. While many splice sites are selected in all transcripts (constitutive splicing), others are used to various levels, resulting in alternative transcripts. Selection of such alternative splice sites is affected by auxiliary cis elements located within exonic and intronic sequences, termed splicing enhancers and silencers. These elements are binding sites for trans-acting splicing factors, for example, hnRNP and SR proteins. These proteins, in addition to their functions in constitutive splicing, play a key role in AS by inhibition or promotion of selection of particular splice sites. The presence and abundance of different splicing factors in different cell types, tissues, developmental stages, and environmental conditions determines the AS profiles of expressed genes and ultimately shapes the transcriptome. In addition, alternative transcripts can code for protein isoforms with altered amino acid and domain composition affecting their activity, interaction capacity, localization, and stability, thus affecting the proteome (Stamm et al. 2005).
Alternative splicing was first described in 1977 as peculiar rearrangements in the adenovirus type 2 mRNA (Berget et al. 1977; Chow et al. 1977). Since the discovery of the first example of AS in an endogenous mammalian gene coding for calcitonin (Rosenfeld et al. 1981), the alignment of expressed sequence tag (EST) contigs to genomic DNA allowed the identification of a large number (∼35%) of alternatively spliced genes in humans (Mironov et al. 1999). Estimates of AS in many different organisms have been made using EST/cDNA libraries (Okazaki et al. 2002; Zavolan et al. 2003; Iida et al. 2004; Cusack and Wolfe 2005; Wakamatsu et al. 2009). With the advent of tiling arrays and high-throughput sequencing, the number of genes which undergo AS has continued to increase (Jones-Rhoades et al. 2007; Weber et al. 2007; Kwan et al. 2008; Mortazavi et al. 2008; Pan et al. 2008). In particular, the application of high-throughput sequencing to transcriptomes (RNA-seq) has now demonstrated that AS occurs in ∼95% of intron-containing genes in human (Pan et al. 2008).
In plants, estimates of the occurrence of AS have been hampered by a low number of ESTs (Brett et al. 2002). However, the levels of AS have continued to increase with greater EST/cDNA coverage: 1.2% (Zhu et al. 2003), 5% (Zhu et al. 2003), 11.6% (Iida et al. 2004), 21.8% (Wang and Brendel 2006), 29% (Xiao et al. 2005), and >30% (Campbell et al. 2006). Many transcriptome studies using high-throughput sequencing have been performed in plants, but few have been used to examine AS (Weber et al. 2007; Filichkin et al. 2010; Lu et al. 2010; Zhang et al. 2010). The most recent estimate based on RNA-seq is that ∼42% of Arabidopsis intron-containing genes undergo AS (Filichkin et al. 2010).
In terms of identifying AS in plants, the expression profile itself influences the representation of many transcripts in databases. For example, an Arabidopsis transcriptome study using 454 Life Sciences (Roche) sequencing (Weber et al. 2007) showed that the top 10 most highly expressed genes represent 25% of the total mapped reads, thus tremendously compromising the representation of less abundant transcripts. To improve gene representation and discovery of AS events in Arabidopsis, we have used RNA-seq of a normalized cDNA library made from Arabidopsis seedlings and flowers. We have shown that normalization significantly increases the coverage of reads across the genes, and we have identified a large number (∼47 k) of new splice junctions. Taking advantage of a high-resolution RT-PCR panel (Simpson et al. 2008a,b), we were able to validate many novel AS events. Altogether, our results show that at least 61% of intron-containing genes are alternatively spliced under normal growth conditions, which indicates a high complexity of the Arabidopsis transcriptome.
TL;DR: The first large scale RNA sequencing study of lung adenocarcinoma is presented, demonstrating its power to identify somatic point mutations as well as transcriptional variants such as gene fusions, alternative splicing events, and expression outliers.
Abstract: All cancers harbor molecular alterations in their genomes. The transcriptional consequences of these somatic mutations have not yet been comprehensively explored in lung cancer. Here we present the first large scale RNA sequencing study of lung adenocarcinoma, demonstrating its power to identify somatic point mutations as well as transcriptional variants such as gene fusions, alternative splicing events, and expression outliers. Our results reveal the genetic basis of 200 lung adenocarcinomas in Koreans including deep characterization of 87 surgical specimens by transcriptome sequencing. We identified driver somatic mutations in cancer genes including EGFR, KRAS, NRAS, BRAF, PIK3CA, MET, and CTNNB1. Candidates for novel driver mutations were also identified in genes newly implicated in lung adenocarcinoma such as LMTK2, ARID1A, NOTCH2, and SMARCA4. We found 45 fusion genes, eight of which were chimeric tyrosine kinases involving ALK, RET, ROS1, FGFR2, AXL, and PDGFRA. Among 17 recurrent alternative splicing events, we identified exon 14 skipping in the proto-oncogene MET as highly likely to be a cancer driver. The number of somatic mutations and expression outliers varied markedly between individual cancers and was strongly correlated with smoking history of patients. We identified genomic blocks within which gene expression levels were consistently increased or decreased that could be explained by copy number alterations in samples. We also found an association between lymph node metastasis and somatic mutations in TP53. These findings broaden our understanding of lung adenocarcinoma and may also lead to new diagnostic and therapeutic approaches.
TL;DR: This review summarises the preclinical and clinical data on EGFR exon 20 insertions, with an emphasis on their structural, molecular, and clinical implications.
Abstract: Lung cancer is the leading cause of cancer-related death. The identification of epidermal growth factor receptor (EGFR) somatic mutations defined a new, molecularly classified subgroup of non-small-cell lung cancer (NSCLC). Classic EGFR activating mutations, such as in-frame deletions in exon 19 or the Leu858Arg (L858R) point mutation in exon 21, are associated with sensitivity to first-generation quinazoline reversible EGFR tyrosine kinase inhibitors (TKIs). EGFR exon 20 insertion mutations, which are typically located after the C-helix of the tyrosine kinase domain of EGFR, may account for up to 4% of all EGFR mutations. Preclinical models have shown that the most prevalent EGFR exon 20 insertion mutated proteins are resistant to clinically achievable doses of reversible (gefitinib, erlotinib) and irreversible (neratinib, afatinib, PF00299804) EGFR TKIs. Growing clinical experience with patients whose tumours harbour EGFR exon 20 insertions corresponds with the preclinical data; very few patients have had responses to EGFR TKIs. Despite the prevalence and biological importance of EGFR exon 20 insertions, few reports have summarised all preclinical and clinical data on these mutations. Here, we review the literature and provide an update with an emphasis on the structural, molecular, and clinical implications of EGFR exon 20 insertions.
TL;DR: The muscle-blind-like (Mbnl) family of RNA-binding proteins plays important roles in muscle and eye development and in myotonic dystrophy (DM), in which expanded CUG or CCUG repeats functionally deplete Mbnl proteins.
TL;DR: MATS (multivariate analysis of transcript splicing), a Bayesian statistical framework for flexible hypothesis testing of differential alternative splicing patterns on RNA-Seq data, is developed and shown to be an effective and flexible approach for detecting differential alternative splicing from RNA-Seq data.
Abstract: Ultra-deep RNA sequencing has become a powerful approach for genome-wide analysis of pre-mRNA alternative splicing. We develop MATS (multivariate analysis of transcript splicing), a Bayesian statistical framework for flexible hypothesis testing of differential alternative splicing patterns on RNA-Seq data. MATS uses a multivariate uniform prior to model the between-sample correlation in exon splicing patterns, and a Markov chain Monte Carlo (MCMC) method coupled with a simulation-based adaptive sampling procedure to calculate the P-value and false discovery rate (FDR) of differential alternative splicing. Importantly, the MATS approach is applicable to almost any type of null hypotheses of interest, providing the flexibility to identify differential alternative splicing events that match a given user-defined pattern. We evaluated the performance of MATS using simulated and real RNA-Seq data sets. In the RNA-Seq analysis of alternative splicing events regulated by the epithelial-specific splicing factor ESRP1, we obtained a high RT-PCR validation rate of 86% for differential exon skipping events with a MATS FDR of <10%. Additionally, over the full list of RT-PCR tested exons, the MATS FDR estimates matched well with the experimental validation rate. Our results demonstrate that MATS is an effective and flexible approach for detecting differential alternative splicing from RNA-Seq data.
TL;DR: There is no indication that any prokaryote has ever possessed a spliceosome or introns in protein-coding genes, other than relatively rare mobile self-splicing introns, and introns were a major factor of evolution throughout the history of eukaryotes.
Abstract: Evolution of exon-intron structure of eukaryotic genes has been a matter of long-standing, intensive debate. The introns-early concept, later rebranded ‘introns first’ held that protein-coding genes were interrupted by numerous introns even at the earliest stages of life's evolution and that introns played a major role in the origin of proteins by facilitating recombination of sequences coding for small protein/peptide modules. The introns-late concept held that introns emerged only in eukaryotes and new introns have been accumulating continuously throughout eukaryotic evolution. Analysis of orthologous genes from completely sequenced eukaryotic genomes revealed numerous shared intron positions in orthologous genes from animals and plants and even between animals, plants and protists, suggesting that many ancestral introns have persisted since the last eukaryotic common ancestor (LECA). Reconstructions of intron gain and loss using the growing collection of genomes of diverse eukaryotes and increasingly advanced probabilistic models convincingly show that the LECA and the ancestors of each eukaryotic supergroup had intron-rich genes, with intron densities comparable to those in the most intron-rich modern genomes such as those of vertebrates. The subsequent evolution in most lineages of eukaryotes involved primarily loss of introns, with only a few episodes of substantial intron gain that might have accompanied major evolutionary innovations such as the origin of metazoa. The original invasion of self-splicing Group II introns, presumably originating from the mitochondrial endosymbiont, into the genome of the emerging eukaryote might have been a key factor of eukaryogenesis that in particular triggered the origin of endomembranes and the nucleus. Conversely, splicing errors gave rise to alternative splicing, a major contribution to the biological complexity of multicellular eukaryotes. 
There is no indication that any prokaryote has ever possessed a spliceosome or introns in protein-coding genes, other than relatively rare mobile self-splicing introns. Thus, the introns-first scenario is not supported by any evidence but exon-intron structure of protein-coding genes appears to have evolved concomitantly with the eukaryotic cell, and introns were a major factor of evolution throughout the history of eukaryotes. This article was reviewed by I. King Jordan, Manuel Irimia (nominated by Anthony Poole), Tobias Mourier (nominated by Anthony Poole), and Fyodor Kondrashov. For the complete reports, see the Reviewers’ Reports section.
TL;DR: It is argued that the increased presence of SR and hnRNP proteins promoted the evolution of alternative splicing through relaxation of the sequence requirements of splice junctions.
Abstract: The splicing of pre-mRNAs is an essential step of gene expression in eukaryotes. Introns are removed from split genes through the activities of the spliceosome, a large ribonuclear machine that is conserved throughout the eukaryotic lineage. While unicellular eukaryotes are characterized by less complex splicing, pre-mRNA splicing of multicellular organisms is often associated with extensive alternative splicing that significantly enriches their proteome. The alternative selection of splice sites and exons permits multicellular organisms to modulate gene expression patterns in a cell type-specific fashion, thus contributing to their functional diversification. Alternative splicing is a regulated process that is mainly influenced by the activities of splicing regulators, such as SR proteins or hnRNPs. These modular factors have evolved from a common ancestor through gene duplication events to a diverse group of splicing regulators that mediate exon recognition through their sequence-specific binding to pre-mRNAs. Given the strong correlations between intron expansion, the complexity of pre-mRNA splicing, and the emergence of splicing regulators, it is argued that the increased presence of SR and hnRNP proteins promoted the evolution of alternative splicing through relaxation of the sequence requirements of splice junctions.
TL;DR: It is shown that the association of nsp10 with nsp14 stimulates the ExoN activity of the latter >35-fold while having no effect on N7-MTase activity, which indicates an RNA processing function potentially connected to a replicative mismatch repair mechanism.
Abstract: The replication/transcription complex of severe acute respiratory syndrome coronavirus is composed of at least 16 nonstructural proteins (nsp1–16) encoded by the ORF-1a/1b. This complex includes replication enzymes commonly found in positive-strand RNA viruses, but also a set of RNA-processing activities unique to some nidoviruses. The nsp14 protein carries both exoribonuclease (ExoN) and (guanine-N7)-methyltransferase (N7-MTase) activities. The nsp14 ExoN activity ensures a yet-uncharacterized function in the virus life cycle and must be regulated to avoid nonspecific RNA degradation. In this work, we show that the association of nsp10 with nsp14 stimulates >35-fold the ExoN activity of the latter while playing no effect on N7-MTase activity. Nsp10 mutants unable to interact with nsp14 are not proficient for ExoN activation. The nsp10/nsp14 complex hydrolyzes double-stranded RNA in a 3′ to 5′ direction as well as a single mismatched nucleotide at the 3′-end mimicking an erroneous replication product. In contrast, di-, tri-, and longer unpaired ribonucleotide stretches, as well as 3′-modified RNAs, resist nsp10/nsp14-mediated excision. In addition to the activation of nsp16-mediated 2′-O-MTase activity, nsp10 also activates nsp14 in an RNA processing function potentially connected to a replicative mismatch repair mechanism.
TL;DR: It is demonstrated that the PWWP domain of the chromatin-associated protein Psip1/Ledgf can specifically recognize tri-methylated H3K36 and that, like this histone modification, the Psip1 short (p52) isoform is enriched at active genes.
Abstract: Increasing evidence suggests that chromatin modifications have important roles in modulating constitutive or alternative splicing. Here we demonstrate that the PWWP domain of the chromatin-associated protein Psip1/Ledgf can specifically recognize tri-methylated H3K36 and that, like this histone modification, the Psip1 short (p52) isoform is enriched at active genes. We show that the p52, but not the long (p75), isoform of Psip1 co-localizes and interacts with Srsf1 and other proteins involved in mRNA processing. The level of H3K36me3 associated Srsf1 is reduced in Psip1 mutant cells and alternative splicing of specific genes is affected. Moreover, we show altered Srsf1 distribution around the alternatively spliced exons of these genes in Psip1 null cells. We propose that Psip1/p52, through its binding to both chromatin and splicing factors, might act to modulate splicing.
TL;DR: It is reported that Mbnl2 knockout mice develop several DM-associated central nervous system (CNS) features including abnormal REM sleep propensity and deficits in spatial memory.
TL;DR: An unbiased, genome-wide bioinformatic screen for gene fusions using Affymetrix Exon array expression data is developed, and the novel HEY1-NCOA2 fusion it uncovered appears to be the defining and diagnostic gene fusion in mesenchymal chondrosarcomas.
Abstract: Cancer gene fusions that encode a chimeric protein are often characterized by an intragenic discontinuity in the RNA expression levels of the exons that are 5' or 3' to the fusion point in one or both of the fusion partners due to differences in the levels of activation of their respective promoters. Based on this, we developed an unbiased, genome-wide bioinformatic screen for gene fusions using Affymetrix Exon array expression data. Using a training set of 46 samples with different known gene fusions, we developed a data analysis pipeline, the "Fusion Score (FS) model", to score and rank genes for intragenic changes in expression. In a separate discovery set of 41 tumor samples with possible unknown gene fusions, the FS model generated a list of 552 candidate genes. The transcription factor gene NCOA2 was one of the candidates identified in a mesenchymal chondrosarcoma. A novel HEY1-NCOA2 fusion was identified by 5' RACE, representing an in-frame fusion of HEY1 exon 4 to NCOA2 exon 13. RT-PCR or FISH evidence of this HEY1-NCOA2 fusion was present in all additional mesenchymal chondrosarcomas tested with a definitive histologic diagnosis and adequate material for analysis (n = 9) but was absent in 15 samples of other subtypes of chondrosarcomas. We also identified a NUP107-LGR5 fusion in a dedifferentiated liposarcoma but analysis of 17 additional samples did not confirm it as a recurrent event in this sarcoma type. The novel HEY1-NCOA2 fusion appears to be the defining and diagnostic gene fusion in mesenchymal chondrosarcomas.
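The intragenic discontinuity described in this abstract can be illustrated with a minimal sketch. It is not the published Fusion Score model, whose details are not given here. It only scans a gene's exons, ordered 5' to 3', for the boundary where mean expression changes most. The function name `best_breakpoint` and the toy log2 expression values are hypothetical and chosen purely for illustration.

```python
from statistics import mean


def best_breakpoint(exon_log2_expr):
    """Find the exon boundary with the largest 5'/3' expression step.

    Hypothetical helper for illustration only; NOT the published FS model.
    exon_log2_expr: per-exon log2 expression values, ordered 5' to 3'.
    Returns (index, score), where index is the position of the first exon
    on the 3' side of the candidate breakpoint and score is the absolute
    difference between the mean expression of the two sides.
    """
    if len(exon_log2_expr) < 2:
        raise ValueError("need at least two exons")
    best_index, best_score = None, float("-inf")
    for i in range(1, len(exon_log2_expr)):
        upstream = exon_log2_expr[:i]      # exons 5' of the boundary
        downstream = exon_log2_expr[i:]    # exons 3' of the boundary
        score = abs(mean(downstream) - mean(upstream))
        if score > best_score:
            best_index, best_score = i, score
    return best_index, best_score


# Toy values (invented): low expression in exons 1-3, high in exons 4-5,
# as might be seen when a stronger partner promoter drives the 3' exons.
toy_gene = [2.0, 2.0, 2.0, 6.0, 6.0]
index, score = best_breakpoint(toy_gene)
print(f"candidate breakpoint before exon {index + 1}, step = {score}")
```

A real screen would also need to account for probe-level noise and for expression differences between samples, which this sketch ignores.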
TL;DR: It is found that both FUS and TDP-43 regulate genes that function in neuronal development, and a saw-tooth binding pattern in long genes demonstrated that FUS remains bound to pre-mRNAs until splicing is completed.
Abstract: Fused in sarcoma (FUS) and TAR DNA-binding protein 43 (TDP-43) are RNA-binding proteins pathogenetically linked to amyotrophic lateral sclerosis (ALS) and frontotemporal lobar degeneration (FTLD), but it is not known if they regulate the same transcripts. We addressed this question using crosslinking and immunoprecipitation (iCLIP) in mouse brain, which showed that FUS binds along the whole length of the nascent RNA with limited sequence specificity to GGU and related motifs. A saw-tooth binding pattern in long genes demonstrated that FUS remains bound to pre-mRNAs until splicing is completed. Analysis of FUS−/− brain demonstrated a role for FUS in alternative splicing, with increased crosslinking of FUS in introns around the repressed exons. We did not observe a significant overlap in the RNA binding sites or the exons regulated by FUS and TDP-43. Nevertheless, we found that both proteins regulate genes that function in neuronal development.
TL;DR: The observations from the Drosophila model point toward an evolutionarily conserved role of RNA methylation in normal cognitive development, suggesting that mutations in this gene might even induce a syndromic form of ID.
Abstract: With a prevalence between 1 and 3%, hereditary forms of intellectual disability (ID) are among the most important problems in health care. Particularly, autosomal-recessive forms of the disorder have a very heterogeneous molecular basis, and genes with an increased number of disease-causing mutations are not common. Here, we report on three different mutations (two nonsense mutations, c.679C>T [p.Gln227∗] and c.1114C>T [p.Gln372∗], as well as one splicing mutation, g.6622224A>C [p.Ile179Argfs∗192]) that cause a loss of the tRNA-methyltransferase-encoding NSUN2 main transcript in homozygotes. We identified the mutations by sequencing exons and exon-intron boundaries within the genomic region where the linkage intervals of three independent consanguineous families of Iranian and Kurdish origin overlapped with the previously described MRT5 locus. In order to gain further evidence concerning the effect of a loss of NSUN2 on memory and learning, we constructed a Drosophila model by deleting the NSUN2 ortholog, CG6133, and investigated the mutants by using molecular and behavioral approaches. When the Drosophila melanogaster NSUN2 ortholog was deleted, severe short-term-memory (STM) deficits were observed; STM could be rescued by re-expression of the wild-type protein in the nervous system. The humans homozygous for NSUN2 mutations showed an overlapping phenotype consisting of moderate to severe ID and facial dysmorphism (which includes a long face, characteristic eyebrows, a long nose, and a small chin), suggesting that mutations in this gene might even induce a syndromic form of ID. Moreover, our observations from the Drosophila model point toward an evolutionarily conserved role of RNA methylation in normal cognitive development.
TL;DR: The study identifies substantial tissue-specific differential distributions of 5-hmC and 5-mC at the exon-intron boundary in human and mouse tissues and suggests a new role for 5-hmC in RNA splicing and synaptic function in the brain.
Abstract: The 5-methylcytosine (5-mC) derivative 5-hydroxymethylcytosine (5-hmC) is abundant in the brain for unknown reasons. Here we characterize the genomic distribution of 5-hmC and 5-mC in human and mouse tissues. We assayed 5-hmC by using glucosylation coupled with restriction-enzyme digestion and microarray analysis. We detected 5-hmC enrichment in genes with synapse-related functions in both human and mouse brain. We also identified substantial tissue-specific differential distributions of these DNA modifications at the exon-intron boundary in human and mouse. This boundary change was mainly due to 5-hmC in the brain but due to 5-mC in non-neural contexts. This pattern was replicated in multiple independent data sets and with single-molecule sequencing. Moreover, in human frontal cortex, constitutive exons contained higher levels of 5-hmC relative to alternatively spliced exons. Our study suggests a new role for 5-hmC in RNA splicing and synaptic function in the brain.
TL;DR: In this paper, the authors found that postsynaptic density protein 95 (PSD-95) was controlled post-transcriptionally during neural development by two PTB proteins whose sequential downregulation is necessary for synapse maturation.
Abstract: Postsynaptic density protein 95 (PSD-95) is essential for synaptic maturation and plasticity. Although its synaptic regulation has been widely studied, the control of PSD-95 cellular expression is not understood. We found that Psd-95 was controlled post-transcriptionally during neural development. Psd-95 was transcribed early in mouse embryonic brain, but most of its product transcripts were degraded. The polypyrimidine tract binding proteins PTBP1 and PTBP2 repressed Psd-95 (also known as Dlg4) exon 18 splicing, leading to premature translation termination and nonsense-mediated mRNA decay. The loss of first PTBP1 and then of PTBP2 during embryonic development allowed splicing of exon 18 and expression of PSD-95 late in neuronal maturation. Re-expression of PTBP1 or PTBP2 in differentiated neurons inhibited PSD-95 expression and impaired the development of glutamatergic synapses. Thus, expression of PSD-95 during early neural development is controlled at the RNA level by two PTB proteins whose sequential downregulation is necessary for synapse maturation.
TL;DR: Coexistence of PIK3CA (the PI3K p110α subunit) exon 9 and 20 mutations, but not PIK3CA mutation in either exon 9 or 20 alone, is associated with poor prognosis of colorectal cancer patients.
TL;DR: DNMT1 is a widely expressed DNA methyltransferase that maintains methylation patterns in development and mediates transcriptional repression by direct binding to HDAC2; it is also highly expressed in immune cells and required for the differentiation of CD4+ cells into T regulatory cells.
Abstract: Autosomal dominant cerebellar ataxia, deafness and narcolepsy (ADCA-DN) is characterized by late onset (30-40 years old) cerebellar ataxia, sensory neuronal deafness, narcolepsy-cataplexy and dementia. We performed exome sequencing in five individuals from three ADCA-DN kindreds and identified DNMT1 as the only gene with mutations found in all five affected individuals. Sanger sequencing confirmed the de novo mutation p.Ala570Val in one family, and showed co-segregation of p.Val606Phe and p.Ala570Val, with the ADCA-DN phenotype, in two other kindreds. An additional ADCA-DN kindred with a p.Gly605Ala mutation was subsequently identified. Narcolepsy and deafness were the first symptoms to appear in all pedigrees, followed by ataxia. DNMT1 is a widely expressed DNA methyltransferase maintaining methylation patterns in development, and mediating transcriptional repression by direct binding to HDAC2. It is also highly expressed in immune cells and required for the differentiation of CD4+ into T regulatory cells. Mutations in exon 20 of this gene were recently reported to cause hereditary sensory neuropathy with dementia and hearing loss (HSAN1). Our mutations are all located in exon 21 and in very close spatial proximity, suggesting distinct phenotypes depending on mutation location within this gene.
TL;DR: It is demonstrated that engineered inactivation of severe acute respiratory syndrome-CoV ExoN activity results in a stable mutator phenotype with profoundly decreased fidelity in vivo and attenuation of pathogenesis in young, aged and immunocompromised mice.
Abstract: Live, attenuated RNA virus vaccines are efficacious but subject to reversion to virulence. Among RNA viruses, replication fidelity is recognized as a key determinant of virulence and escape from antiviral therapy; increased fidelity is attenuating for some viruses. Coronavirus (CoV) replication fidelity is approximately 20-fold greater than that of other RNA viruses and is mediated by a 3'→5' exonuclease (ExoN) activity that probably functions in RNA proofreading. In this study we demonstrate that engineered inactivation of severe acute respiratory syndrome (SARS)-CoV ExoN activity results in a stable mutator phenotype with profoundly decreased fidelity in vivo and attenuation of pathogenesis in young, aged and immunocompromised mice. The ExoN inactivation genotype and mutator phenotype are stable and do not revert to virulence, even after serial passage or long-term persistent infection in vivo. ExoN inactivation has potential for broad applications in the stable attenuation of CoVs and, perhaps, other RNA viruses.
TL;DR: The findings establish NSUN2 as the first causal gene with relationship to the DS spectrum phenotype, which might help explain the varied clinical presentation in DS that can include chromosomal instability and immunological defects.
Abstract: Background Dubowitz syndrome (DS) is an autosomal recessive disorder characterized by the constellation of mild microcephaly, growth and mental retardation, eczema and peculiar facies. Over 140 cases have been reported, but the genetic basis is not understood.
Methods We enrolled a multiplex consanguineous family from the United Arab Emirates with many of the key clinical features of DS as reported in previous series. The family was analyzed by whole exome sequencing. RNA splicing was evaluated with reverse-transcriptase PCR, immunostaining and western blotting were performed with specific antibodies, and site-specific cytosine-5-methylation was studied with bisulfite sequencing.
Results We identified a homozygous splice mutation in the NSUN2 gene, encoding a conserved RNA methyltransferase. The mutation abolished the canonical splice acceptor site of exon 6, leading to use of a cryptic splice donor within an AluY and subsequent mRNA instability. Patient cells lacked NSUN2 protein and there was resultant loss of site-specific 5-cytosine methylation of the tRNA(Asp GTC) at C47 and C48, known NSUN2 targets.
Conclusion Our findings establish NSUN2 as the first causal gene with relationship to the DS spectrum phenotype. NSUN2 has been implicated in Myc-induced cell proliferation and mitotic spindle stability, which might help explain the varied clinical presentation in DS that can include chromosomal instability and immunological defects.
TL;DR: The analysis revealed that FUS regulates alternative splicing events and transcriptions in a position-dependent manner and binding of FUS to the promoter antisense strand downregulates transcriptions of the coding strand.
Abstract: FUS is an RNA-binding protein that regulates transcription, alternative splicing, and mRNA transport. Aberrations of FUS are causally associated with familial and sporadic ALS/FTLD. We analyzed FUS-mediated transcriptions and alternative splicing events in mouse primary cortical neurons using exon arrays. We also characterized FUS-binding RNA sites in the mouse cerebrum with HITS-CLIP. We found that FUS-binding sites tend to form stable secondary structures. Analysis of position-dependence of FUS-binding sites disclosed scattered binding of FUS to and around the alternatively spliced exons including those associated with neurodegeneration such as Mapt, Camk2a, and Fmr1. We also found that FUS is often bound to the antisense RNA strand at the promoter regions. Global analysis of these FUS-tags and the expression profiles disclosed that binding of FUS to the promoter antisense strand downregulates transcriptions of the coding strand. Our analysis revealed that FUS regulates alternative splicing events and transcriptions in a position-dependent manner.
TL;DR: It is shown that, unlike Rbfox1 deletion, CNS-specific deletion of Rbfox2 disrupts cerebellar development, and that Rbfox2 is required together with Rbfox1 to maintain mature neuronal physiology, specifically Purkinje cell pacemaking, through their shared control of sodium channel transcript splicing.
Abstract: Alternative pre-mRNA splicing is an important mechanism for regulating gene expression that contributes greatly to proteomic diversity in eukaryotes (Black 2003; Blencowe 2006; Nilsen and Graveley 2010). Changes in exon inclusion or splice site usage can substantially alter the expression or function of the encoded protein. Alternative splicing is especially prevalent in the mammalian nervous system, where it controls aspects of neural tube patterning, synaptogenesis, and the regulation of membrane physiology, among other important processes (Lipscombe 2005; Licatalosi and Darnell 2006; Li et al. 2007). The choice of splicing pattern is generally controlled by trans-acting RNA-binding proteins that bind to cis-acting elements in the pre-mRNA to enhance or silence particular splicing events (Black 2003; Matlin et al. 2005; Chen and Manley 2009; Nilsen and Graveley 2010). These RNA-binding proteins can be expressed in a temporal- or tissue-specific manner to alter the splicing of a defined set of transcripts. Some of these splicing regulators have been shown to play important roles in the developing and adult mammalian brain (Jensen et al. 2000; Lukong and Richard 2008; Calarco et al. 2009; Yano et al. 2010; Gehman et al. 2011; Raj et al. 2011; Zheng et al. 2012).
In mammals, the RNA-binding Fox (Rbfox) family of splicing regulators is comprised of three members: Rbfox1 (Fox-1 or A2BP1), Rbfox2 (Fox-2 or RBM9), and Rbfox3 (Fox-3, HRNBP3, or NeuN) (Kuroyanagi 2009). Each Fox protein has a single central RNA recognition motif (RRM) RNA-binding domain that recognizes the sequence (U)GCAUG found within introns flanking alternative exons (Jin et al. 2003; Auweter et al. 2006; Ponthier et al. 2006). The position of the (U)GCAUG motif with respect to the alternative exon dictates the effect of the Rbfox proteins on splicing. A motif located downstream from the alternative exon generally promotes Rbfox-dependent exon inclusion, whereas an upstream motif will usually repress inclusion (Huh and Hynes 1994; Modafferi and Black 1997; Jin et al. 2003; Nakahata and Kawamoto 2005; Underwood et al. 2005; Zhang et al. 2008; Kuroyanagi 2009; Yeo et al. 2009). The three mouse Rbfox paralogs show a high degree of sequence conservation, especially within the RNA-binding domain, which is identical between Rbfox1 and Rbfox2 and only slightly altered in Rbfox3 (94% amino acid identity). The N-terminal and C-terminal domains are less similar between the proteins, presumably allowing for different protein–protein interactions. All three Rbfox family members are highly expressed in most neurons of the mature brain, where they regulate the splicing of neuronal transcripts (McKee et al. 2005; Nakahata and Kawamoto 2005; Underwood et al. 2005; Kim et al. 2009; Tang et al. 2009; Hammock and Levitt 2011). Rbfox1 and Rbfox2 have been shown to control a shared set of neuronal-specific target exons, including exon N30 of nonmuscle myosin heavy chain II-B (NMHC-B), exon N1 of c-src, and exons 9* and 33 of the L-type calcium channel Cav1.2 (Nakahata and Kawamoto 2005; Underwood et al. 2005; Tang et al. 2009).
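The positional rule described above (a downstream (U)GCAUG motif generally promotes inclusion, an upstream motif usually represses it) can be illustrated with a minimal Python sketch. This is not a method from the paper: the function name `predict_rbfox_effect`, the return labels, and the choice to scan whole flanking intron strings are all assumptions made for illustration, and real Rbfox regulation also depends on factors such as motif distance and sequence context that this toy ignores.

```python
import re

# (U)GCAUG in the RNA alphabet; the leading U is optional, as in the text.
RBFOX_MOTIF = re.compile(r"U?GCAUG")


def _to_rna(seq: str) -> str:
    """Uppercase the sequence and convert DNA T to RNA U."""
    return seq.upper().replace("T", "U")


def predict_rbfox_effect(upstream_intron: str, downstream_intron: str) -> str:
    """Hypothetical helper: apply the general positional rule to two flanks.

    Returns one of four illustrative labels (assumed names, not from the
    source): "inclusion", "repression", "ambiguous", or "no motif".
    """
    has_up = RBFOX_MOTIF.search(_to_rna(upstream_intron)) is not None
    has_down = RBFOX_MOTIF.search(_to_rna(downstream_intron)) is not None

    if has_up and has_down:
        return "ambiguous"   # motifs on both sides; the simple rule cannot decide
    if has_down:
        return "inclusion"   # downstream motif generally promotes inclusion
    if has_up:
        return "repression"  # upstream motif usually represses inclusion
    return "no motif"


if __name__ == "__main__":
    # Made-up flanking sequences, for demonstration only.
    print(predict_rbfox_effect("AAACCCAAA", "CCUGCAUGCC"))
    print(predict_rbfox_effect("ccgcatgaa", "AAACCCAAA"))
```

The sketch only checks for motif presence on each side; the text says "generally" and "usually", so the labels should be read as tendencies rather than predictions.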
The individual Rbfox family members show differing patterns of expression. Rbfox1 is expressed in neurons, heart, and muscle, while Rbfox3 is limited to neurons (Wolf et al. 1996; Jin et al. 2003; McKee et al. 2005; Underwood et al. 2005; Kim et al. 2009; Damianov and Black 2010). Rbfox2 is expressed in these tissues as well as other cell types, including the embryo, hematopoietic cells, and embryonic stem cells (ESCs) (Underwood et al. 2005; Ponthier et al. 2006; Yeo et al. 2007). Thus, although the Rbfox proteins can regulate many of the same target exons when ectopically expressed, their in vivo targets may differ due to the variable expression of each protein. For example, Rbfox2 controls the developmental-specific splicing of exons in fibroblast growth factor receptor 2 (FGFR2), erythrocyte protein 4.1R, and STE20-like kinase in cells where the other proteins are absent (Baraniak et al. 2006; Ponthier et al. 2006; Yang et al. 2008; Yeo et al. 2009). Rbfox2 is clearly important for splicing regulation during embryonic growth and development, but its role in the brain is less clear.
Defects in alternative splicing can lead to neurological and neuromuscular disease, such as frontotemporal dementia and myotonic dystrophy (Faustino and Cooper 2003; Licatalosi and Darnell 2006; Cooper et al. 2009). The Rbfox proteins have also been linked to neurological conditions. Human mutations in the RBFOX1 (A2BP1) gene can lead to severe disorders, including mental retardation, epilepsy, and autism spectrum disorder (Bhalla et al. 2004; Barnby et al. 2005; Martin et al. 2007; Sebat et al. 2007; Voineagu et al. 2011). Moreover, human RBFOX1 was first identified through an interaction with Ataxin-2, the protein mutated in spinocerebellar ataxia type II (SCAII), and RBFOX2 was later shown to interact with Ataxin-1, which is mutated in SCAI patients (Shibata et al. 2000; Lim et al. 2006). These results imply a role for Rbfox proteins in cerebellar function.
We recently showed that deletion of Rbfox1 results in increased neuronal excitation in the hippocampus and seizures in the mouse, in keeping with its regulation of many gene products important for synaptic transmission (Gehman et al. 2011). Rbfox1 mutation did not lead to obvious cerebellar defects. Interestingly, deletion of Rbfox2 did not produce the same seizure phenotype as Rbfox1 deletion. Thus, while the Rbfox proteins share some target exons in the brain, they are not fully redundant in their functions.
To better understand the roles of Rbfox-mediated splicing regulation in the brain, we created mice with tissue- and cell type-specific deletions of one or more Rbfox proteins. We found that CNS-specific deletion of Rbfox2 results in impaired cerebellar development and additional neurological phenotypes, whereas postnatal deletion from cerebellar Purkinje neurons leads to marked deficits in neuronal excitability and, specifically, pacemaking. Thus, like Rbfox1, Rbfox2 is essential for the proper function of mature neural circuits, but also plays a role in brain development.
TL;DR: Structural alterations in the AR gene are linked to stable gain-of-function splicing alterations in CRPCa, and this study demonstrates that complex patterns of AR gene copy number imbalances occur in PCa cell lines, xenografts and clinical specimens.
Abstract: Reactivation of the androgen receptor (AR) during androgen depletion therapy (ADT) underlies castration-resistant prostate cancer (CRPCa). Alternative splicing of the AR gene and synthesis of constitutively active COOH-terminally truncated AR variants lacking the AR ligand-binding domain has emerged as an important mechanism of ADT resistance in CRPCa. In a previous study, we demonstrated that altered AR splicing in CRPCa 22Rv1 cells was linked to a 35-kb intragenic tandem duplication of AR exon 3 and flanking sequences. In this study, we demonstrate that complex patterns of AR gene copy number imbalances occur in PCa cell lines, xenografts and clinical specimens. To investigate whether these copy number imbalances reflect AR gene rearrangements that could be linked to splicing disruptions, we carried out a detailed analysis of AR gene structure in the LuCaP 86.2 and CWR-R1 models of CRPCa. By deletion-spanning PCR, we discovered an 8579-bp deletion of AR exons 5, 6 and 7 in the LuCaP 86.2 xenograft, which provides a rational explanation for synthesis of the truncated AR v567es variant in this model. Similarly, targeted resequencing of the AR gene in CWR-R1 cells led to the discovery of a 48-kb deletion in AR intron 1. This intragenic deletion marked a specific CWR-R1 cell population with enhanced expression of the truncated AR-V7/AR3 variant, a high level of androgen-independent AR transcriptional activity and rapid androgen-independent growth. Together, these data demonstrate that structural alterations in the AR gene are linked to stable gain-of-function splicing alterations in CRPCa.
TL;DR: Five potential mechanisms of ETV6-mediated leukemogenesis have been identified, including constitutive activation of the kinase activity of the partner protein, modification of the original functions of a transcription factor, loss of function of the fusion gene, and activation of a proto-oncogene in the vicinity of a chromosomal translocation.
TL;DR: This review presents detailed information about the structure of triplet repeat RNA and addresses the simple sequence repeats of normal and expanded lengths in the context of the physiological and pathogenic roles played in human cells.
Abstract: This review presents detailed information about the structure of triplet repeat RNA and addresses the simple sequence repeats of normal and expanded lengths in the context of the physiological and pathogenic roles played in human cells. First, we discuss the occurrence and frequency of various trinucleotide repeats in transcripts and classify them according to the propensity to form RNA structures of different architectures and stabilities. We show that repeats capable of forming hairpin structures are overrepresented in exons, which implies that they may have important functions. We further describe long triplet repeat RNA as a pathogenic agent by presenting human neurological diseases caused by triplet repeat expansions in which mutant RNA gains a toxic function. Prominent examples of these diseases include myotonic dystrophy type 1 and fragile X-associated tremor ataxia syndrome, which are triggered by mutant CUG and CGG repeats, respectively. In addition, we discuss RNA-mediated pathogenesis in polyglutamine disorders such as Huntington's disease and spinocerebellar ataxia type 3, in which expanded CAG repeats may act as an auxiliary toxic agent. Finally, triplet repeat RNA is presented as a therapeutic target. We describe various concepts and approaches aimed at the selective inhibition of mutant transcript activity in experimental therapies developed for repeat-associated diseases.