Native RNA sequencing in fission yeast reveals frequent alternative splicing isoforms.

José Carlos Montañés et al.
Genome Research
2022‎

The unicellular yeast Schizosaccharomyces pombe (fission yeast) retains many of the splicing features observed in humans and is thus an excellent model to study the basic mechanisms of splicing. Nearly half the genes contain introns, but the impact of alternative splicing in gene regulation and proteome diversification remains largely unexplored. Here we leverage Oxford Nanopore Technologies native RNA sequencing (dRNA), as well as ribosome profiling data, to uncover the full range of polyadenylated transcripts and translated open reading frames. We identify 332 alternative isoforms affecting the coding sequences of 262 different genes, 97 of which occur at frequencies higher than 20%, indicating that functional alternative splicing in S. pombe is more prevalent than previously suspected. Intron retention events make up about 80% of the cases; these events may be involved in the regulation of gene expression and, in some cases, generate novel protein isoforms, as supported by ribosome profiling data in 18 of the intron retention isoforms. One example is the rpl22 gene, in which intron retention is associated with the translation of a protein of only 13 amino acids. We also find that lowly expressed transcripts tend to have longer poly(A) tails than highly expressed transcripts, highlighting an interdependence between poly(A) tail length and transcript expression level. Finally, we discover 214 novel transcripts that are not annotated, including 158 antisense transcripts, some of which also show translation evidence. The methodologies described in this work open new opportunities to study the regulation of splicing in a simple eukaryotic model.

Native molecule sequencing by nano-ID reveals synthesis and stability of RNA isoforms.

Kerstin C Maier et al.
Genome Research
2020‎

Eukaryotic genes often generate a variety of RNA isoforms that can lead to functionally distinct protein variants. The synthesis and stability of RNA isoforms is poorly characterized because current methods to quantify RNA metabolism use short-read sequencing and cannot detect RNA isoforms. Here we present nanopore sequencing-based isoform dynamics (nano-ID), a method that detects newly synthesized RNA isoforms and monitors isoform metabolism. Nano-ID combines metabolic RNA labeling, long-read nanopore sequencing of native RNA molecules, and machine learning. Nano-ID derives RNA stability estimates and evaluates stability determining factors such as RNA sequence, poly(A)-tail length, secondary structure, translation efficiency, and RNA-binding proteins. Application of nano-ID to the heat shock response in human cells reveals that many RNA isoforms change their stability. Nano-ID also shows that the metabolism of individual RNA isoforms differs strongly from that estimated for the combined RNA signal at a specific gene locus. Nano-ID enables studies of RNA metabolism at the level of single RNA molecules and isoforms in different cell states and conditions.

Compartment-specific and ELAVL1-coordinated regulation of intronic polyadenylation isoforms by doxorubicin.

Alina Chakraborty et al.
Genome Research
2022‎

Intronic polyadenylation (IPA) isoforms, which contain alternative last exons, are widely regulated in various biological processes and by many factors. However, little is known about their cytoplasmic regulation and translational status. In this study, we provide the first evidence that the genome-wide patterns of IPA isoform regulation during a biological process can be very distinct between the transcriptome and translatome, and between the nucleus and cytosol. Indeed, by 3'-seq analyses on breast cancer cells, we show that the genotoxic anticancer drug, doxorubicin, preferentially down-regulates the IPA to the last-exon (IPA:LE) isoform ratio in whole cells (as previously reported) but preferentially up-regulates it in polysomes. We further show that in nuclei, doxorubicin almost exclusively down-regulates the IPA:LE ratio, whereas in the cytosol, it preferentially up-regulates the isoform ratio, as in polysomes. Then, focusing on IPA isoforms that are up-regulated by doxorubicin in the cytosol and highly translated (up-regulated and/or abundant in polysomes), we identify several IPA isoforms that promote cell survival to doxorubicin. Mechanistically, by using an original approach of condition- and compartment-specific CLIP-seq (CCS-iCLIP) to analyze ELAVL1-RNA interactions in the nucleus and cytosol in the presence and absence of doxorubicin, as well as 3'-seq analyses upon ELAVL1 depletion, we show that the RNA-binding protein ELAVL1 mediates both nuclear down-regulation and cytosolic up-regulation of the IPA:LE isoform ratio in distinct sets of genes in response to doxorubicin. Altogether, these findings reveal differential regulation of the IPA:LE isoform ratio across subcellular compartments during drug response and its coordination by an RNA-binding protein.

Chromatin-sensitive cryptic promoters putatively drive expression of alternative protein isoforms in yeast.

Wu Wei et al.
Genome Research
2019‎

Cryptic transcription is widespread and generates a heterogeneous group of RNA molecules of unknown function. To improve our understanding of cryptic transcription, we investigated their transcription start site (TSS) usage, chromatin organization, and posttranscriptional consequences in Saccharomyces cerevisiae. We show that TSSs of chromatin-sensitive internal cryptic transcripts retain comparable features of canonical TSSs in terms of DNA sequence, directionality, and chromatin accessibility. We define the 5' and 3' boundaries of cryptic transcripts and show that, contrary to RNA degradation-sensitive ones, they often overlap with the end of the gene, thereby using the canonical polyadenylation site, and associate with polyribosomes. We show that chromatin-sensitive cryptic transcripts can be recognized by ribosomes and may produce truncated polypeptides from downstream, in-frame start codons. Finally, we confirm the presence of the predicted polypeptides by reanalyzing N-terminal proteomic data sets. Our work suggests that a fraction of chromatin-sensitive internal cryptic promoters initiates the transcription of alternative truncated mRNA isoforms. The expression of these chromatin-sensitive isoforms is conserved from yeast to human, expanding the functional consequences of cryptic transcription and proteome complexity.

Comprehensive isoform-level analysis reveals the contribution of alternative isoforms to venom evolution and repertoire diversity.

Xinhai Ye et al.
Genome Research
2023‎

Animal venom systems have emerged as valuable models for investigating how novel polygenic phenotypes may arise from gene evolution by varying molecular mechanisms. However, a significant portion of venom genes produce alternative mRNA isoforms that have not been extensively characterized, hindering a comprehensive understanding of venom biology. In this study, we present a full-length isoform-level profiling workflow integrating multiple RNA sequencing technologies, allowing us to reconstruct a high-resolution transcriptome landscape of venom genes in the parasitoid wasp Pteromalus puparum. Our findings demonstrate that more than half of the venom genes generate multiple isoforms within the venom gland. Through mass spectrometry analysis, we confirm that alternative splicing contributes to the diversity of venom proteins, acting as a mechanism for expanding the venom repertoire. Notably, we identified seven venom genes that exhibit distinct isoform usages between the venom gland and other tissues. Furthermore, evolutionary analyses of venom serpin3 and orcokinin reveal that the co-option of an ancient isoform and a newly evolved isoform, respectively, contributes to venom recruitment, providing valuable insights into the genetic mechanisms driving venom evolution in parasitoid wasps. Overall, our study presents a comprehensive investigation of venom genes at the isoform level, significantly advancing our understanding of alternative isoforms in venom diversity and evolution and setting the stage for further in-depth research on venoms.

A human-specific switch of alternatively spliced AFMID isoforms contributes to TP53 mutations and tumor recurrence in hepatocellular carcinoma.

Kuan-Ting Lin et al.
Genome Research
2018‎

Pre-mRNA splicing can contribute to the switch of cell identity that occurs in carcinogenesis. Here, we analyze a large collection of RNA-seq data sets and report that splicing changes in hepatocyte-specific enzymes, such as AFMID and KHK, are associated with HCC patients' survival and relapse. The switch of AFMID isoforms is an early event in HCC development and is associated with driver mutations in TP53 and ARID1A. The switch of AFMID isoforms is human-specific and not detectable in other species, including primates. Finally, we show that overexpression of the full-length AFMID isoform leads to a higher NAD+ level, lower DNA-damage response, and slower cell growth in HepG2 cells. The integrative analysis uncovered a mechanistic link between splicing switches, de novo NAD+ biosynthesis, driver mutations, and HCC recurrence.

An atlas of alternative splicing profiles and functional associations reveals new regulatory programs and genes that simultaneously express multiple major isoforms.

Javier Tapial et al.
Genome Research
2017‎

Alternative splicing (AS) generates remarkable regulatory and proteomic complexity in metazoans. However, the functions of most AS events are not known, and programs of regulated splicing remain to be identified. To address these challenges, we describe the Vertebrate Alternative Splicing and Transcription Database (VastDB), the largest resource of genome-wide, quantitative profiles of AS events assembled to date. VastDB provides readily accessible quantitative information on the inclusion levels and functional associations of AS events detected in RNA-seq data from diverse vertebrate cell and tissue types, as well as developmental stages. The VastDB profiles reveal extensive new intergenic and intragenic regulatory relationships among different classes of AS and previously unknown and conserved landscapes of tissue-regulated exons. Contrary to recent reports concluding that nearly all human genes express a single major isoform, VastDB provides evidence that at least 48% of multiexonic protein-coding genes express multiple splice variants that are highly regulated in a cell/tissue-specific manner, and that >18% of genes simultaneously express multiple major isoforms across diverse cell and tissue types. Isoforms encoded by the latter set of genes are generally coexpressed in the same cells and are often engaged by translating ribosomes. Moreover, they are encoded by genes that are significantly enriched in functions associated with transcriptional control, implying they may have an important and wide-ranging role in controlling cellular activities. VastDB thus provides an unprecedented resource for investigations of AS function and regulation.

Long-read sequencing of nascent RNA reveals coupling among RNA processing events.

Lydia Herzel et al.
Genome Research
2018‎

Pre-mRNA splicing is accomplished by the spliceosome, a megadalton complex that assembles de novo on each intron. Because spliceosome assembly and catalysis occur cotranscriptionally, we hypothesized that introns are removed in the order of their transcription in genomes dominated by constitutive splicing. Remarkably little is known about splicing order and the regulatory potential of nascent transcript remodeling by splicing, due to the limitations of existing methods that focus on analysis of mature splicing products (mRNAs) rather than substrates and intermediates. Here, we overcome this obstacle through long-read RNA sequencing of nascent, multi-intron transcripts in the fission yeast Schizosaccharomyces pombe. Most multi-intron transcripts were fully spliced, consistent with rapid cotranscriptional splicing. However, an unexpectedly high proportion of transcripts were either fully spliced or fully unspliced, suggesting that splicing of any given intron is dependent on the splicing status of other introns in the transcript. Supporting this, mild inhibition of splicing by a temperature-sensitive mutation in prp2, the homolog of vertebrate U2AF65, increased the frequency of fully unspliced transcripts. Importantly, fully unspliced transcripts displayed transcriptional read-through at the polyA site and were degraded cotranscriptionally by the nuclear exosome. Finally, we show that cellular mRNA levels were reduced in genes with a high number of unspliced nascent transcripts during caffeine treatment, showing regulatory significance of cotranscriptional splicing. Therefore, overall splicing of individual nascent transcripts, 3' end formation, and mRNA half-life depend on the splicing status of neighboring introns, suggesting crosstalk among spliceosomes and the polyA cleavage machinery during transcription elongation.

The predicted RNA-binding protein regulome of axonal mRNAs.

Raphaëlle Luisier et al.
Genome Research
2023‎

Neurons are morphologically complex cells that rely on the compartmentalization of protein expression to develop and maintain their cytoarchitecture. The targeting of RNA transcripts to axons is one of the mechanisms that allows rapid local translation of proteins in response to extracellular signals. 3' Untranslated regions (UTRs) of mRNA are noncoding sequences that play a critical role in determining transcript localization and translation by interacting with specific RNA-binding proteins (RBPs). However, how 3' UTRs contribute to mRNA metabolism and the nature of RBP complexes responsible for these functions remain elusive. We performed 3' end sequencing of RNA isolated from cell bodies and axons of sympathetic neurons exposed to either nerve growth factor (NGF) or neurotrophin 3 (NTF3, also known as NT-3). NGF and NTF3 are growth factors essential for sympathetic neuron development through distinct signaling mechanisms. Whereas NTF3 acts mostly locally, NGF signal is retrogradely transported from axons to cell bodies. We discovered that both NGF and NTF3 affect transcription and alternative polyadenylation in the nucleus and induce the localization of specific 3' UTR isoforms to axons, including short 3' UTR isoforms found exclusively in axons. The integration of our data with CLIP sequencing data supports a model whereby long 3' UTR isoforms associate with RBP complexes in the nucleus and, upon reaching the axons, are remodeled locally into shorter isoforms. Our findings shed new light on the complex relationship between nuclear polyadenylation, mRNA localization, and local 3' UTR remodeling in developing neurons.

Improved definition of the mouse transcriptome via targeted RNA sequencing.

Giovanni Bussotti et al.
Genome Research
2016‎

Targeted RNA sequencing (CaptureSeq) uses oligonucleotide probes to capture RNAs for sequencing, providing enriched read coverage, accurate measurement of gene expression, and quantitative expression data. We applied CaptureSeq to refine transcript annotations in the current murine GRCm38 assembly. More than 23,000 regions corresponding to putative or annotated long noncoding RNAs (lncRNAs) and 154,281 known splicing junction sites were selected for targeted sequencing across five mouse tissues and three brain subregions. The results illustrate that the mouse transcriptome is considerably more complex than previously thought. We assemble more complete transcript isoforms than GENCODE, expand transcript boundaries, and connect interspersed islands of mapped reads. We describe a novel filtering pipeline that identifies previously unannotated but high-quality transcript isoforms. In this set, 911 GENCODE neighboring genes are condensed into 400 expanded gene models. Additionally, 594 GENCODE lncRNAs acquire an open reading frame (ORF) when their structure is extended with CaptureSeq. Finally, we validate our observations using current FANTOM and Mouse ENCODE resources.

The full-length transcriptome of C. elegans using direct RNA sequencing.

Nathan P Roach et al.
Genome Research
2020‎

Current transcriptome annotations have largely relied on short read lengths intrinsic to the most widely used high-throughput cDNA sequencing technologies. For example, in the annotation of the Caenorhabditis elegans transcriptome, more than half of the transcript isoforms lack full-length support and instead rely on inference from short reads that do not span the full length of the isoform. We applied nanopore-based direct RNA sequencing to characterize the developmental polyadenylated transcriptome of C. elegans. Taking advantage of long reads spanning the full length of mRNA transcripts, we provide support for 23,865 splice isoforms across 14,611 genes, without the need for computational reconstruction of gene models. Of the isoforms identified, 3452 are novel splice isoforms not present in the WormBase WS265 annotation. Furthermore, we identified 16,342 isoforms in the 3' untranslated region (3' UTR), 2640 of which are novel and do not fall within 10 bp of existing 3'-UTR data sets and annotations. Combining 3' UTRs and splice isoforms, we identified 28,858 full-length transcript isoforms. We also determined that poly(A) tail lengths of transcripts vary across development, as do the strengths of previously reported correlations between poly(A) tail length and expression level, and poly(A) tail length and 3'-UTR length. Finally, we have formatted this data as a publicly accessible track hub, enabling researchers to explore this data set easily in a genome browser.

Quantitative RNA-seq meta-analysis of alternative exon usage in C. elegans.

Nicolas J Tourasse et al.
Genome Research
2017‎

Almost 20 years after the completion of the C. elegans genome sequence, gene structure annotation is still an ongoing process with new evidence for gene variants still being regularly uncovered by additional in-depth transcriptome studies. While alternative splice forms can allow a single gene to encode several functional isoforms, the question of how much spurious splicing is tolerated is still heavily debated. Here we gathered a compendium of 1682 publicly available C. elegans RNA-seq data sets to increase the dynamic range of detection of RNA isoforms, and obtained robust measurements of the relative abundance of each splicing event. While most of the splicing reads come from reproducibly detected splicing events, a large fraction of purported junctions is only supported by a very low number of reads. We devised an automated curation method that takes into account the expression level of each gene to discriminate robust splicing events from potential biological noise. We found that rarely used splice sites disproportionately come from highly expressed genes and are significantly less conserved in other nematode genomes than splice sites with a higher usage frequency. Our increased detection power confirmed trans-splicing for at least 84% of C. elegans protein coding genes. The genes for which trans-splicing was not observed are overwhelmingly low expression genes, suggesting that the mechanism is pervasive but not fully captured by organism-wide RNA-seq. We generated annotated gene models including quantitative exon usage information for the entire C. elegans genome. This allows users to visualize at a glance the relative expression of each isoform for their gene of interest.

Pan-human consensus genome significantly improves the accuracy of RNA-seq analyses.

Benjamin Kaminow et al.
Genome Research
2022‎

The Human Reference Genome serves as the foundation for modern genomic analyses. However, in its present form, it does not adequately represent the vast genetic diversity of the human population. In this study, we explored the consensus genome as a potential successor of the current reference genome and assessed its effect on the accuracy of RNA-seq read alignment. To find the best haploid genome representation, we constructed consensus genomes at the pan-human, superpopulation, and population levels, using variant information from The 1000 Genomes Project Consortium. Using personal haploid genomes as the ground truth, we compared mapping errors for real RNA-seq reads aligned to the consensus genomes versus the reference genome. For reads overlapping homozygous variants, we found that the mapping error decreased by a factor of approximately two to three when the reference was replaced with the pan-human consensus genome. We also found that using more population-specific consensuses resulted in little to no increase over using the pan-human consensus, suggesting a limit in the utility of incorporating a more specific genomic variation. Replacing the reference with consensus genomes impacts functional analyses, such as differential expressions of isoforms, genes, and splice junctions.
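As a rough illustration of the consensus idea described in this abstract, the sketch below replaces a reference base with an alternate allele whenever that allele is more frequent in the population than the reference allele. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function name `build_consensus`, the input layout (0-based positions mapped to alternate-allele frequencies), and the restriction to single-nucleotide variants are all hypothetical choices made for illustration.

```python
def build_consensus(reference: str, variants: dict[int, dict[str, float]]) -> str:
    """Return a consensus sequence carrying the most frequent allele at each site.

    Hypothetical input layout (an assumption for this sketch):
    `variants` maps a 0-based position to {alternate_base: allele_frequency}.
    Only single-nucleotide variants are handled here.
    """
    seq = list(reference)
    for pos, alt_freqs in variants.items():
        if not alt_freqs:
            continue  # no alternate alleles recorded at this site
        # Frequency left over for the reference allele at this site.
        ref_freq = 1.0 - sum(alt_freqs.values())
        # Most frequent alternate allele.
        best_allele, best_freq = max(alt_freqs.items(), key=lambda kv: kv[1])
        # Swap in the alternate only if it is strictly more common than the reference.
        if best_freq > ref_freq:
            seq[pos] = best_allele
    return "".join(seq)


if __name__ == "__main__":
    # Position 1: alternate T at 70% replaces reference C.
    # Position 3: alternate A at 20% does not replace reference T.
    print(build_consensus("ACGT", {1: {"T": 0.7}, 3: {"A": 0.2}}))
```

Computing the frequencies from a single population or superpopulation instead of all samples would give the population-level consensuses that the abstract compares against the pan-human one.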

RNA-Bloom enables reference-free and reference-guided sequence assembly for single-cell transcriptomes.

Ka Ming Nip et al.
Genome Research
2020‎

Despite the rapid advance in single-cell RNA sequencing (scRNA-seq) technologies within the last decade, single-cell transcriptome analysis workflows have primarily used gene expression data while isoform sequence analysis at the single-cell level still remains fairly limited. Detection and discovery of isoforms in single cells is difficult because of the inherent technical shortcomings of scRNA-seq data, and existing transcriptome assembly methods are mainly designed for bulk RNA samples. To address this challenge, we developed RNA-Bloom, an assembly algorithm that leverages the rich information content aggregated from multiple single-cell transcriptomes to reconstruct cell-specific isoforms. Assembly with RNA-Bloom can be either reference-guided or reference-free, thus enabling unbiased discovery of novel isoforms or foreign transcripts. We compared both assembly strategies of RNA-Bloom against five state-of-the-art reference-free and reference-based transcriptome assembly methods. In our benchmarks on a simulated 384-cell data set, reference-free RNA-Bloom reconstructed 37.9%-38.3% more isoforms than the best reference-free assembler, whereas reference-guided RNA-Bloom reconstructed 4.1%-11.6% more isoforms than reference-based assemblers. When applied to a real 3840-cell data set consisting of more than 4 billion reads, RNA-Bloom reconstructed 9.7%-25.0% more isoforms than the best competing reference-based and reference-free approaches evaluated. We expect RNA-Bloom to boost the utility of scRNA-seq data beyond gene expression analysis, expanding what is informatically accessible now.

Direct full-length RNA sequencing reveals unexpected transcriptome complexity during Caenorhabditis elegans development.

Runsheng Li et al.
Genome Research
2020‎

Massively parallel sequencing of the polyadenylated RNAs has played a key role in delineating transcriptome complexity, including alternative use of an exon, promoter, 5' or 3' splice site or polyadenylation site, and RNA modification. However, reads derived from the current RNA-seq technologies are usually short and deprived of information on modification, compromising their potential in defining transcriptome complexity. Here, we applied a direct RNA sequencing method with ultralong reads using Oxford Nanopore Technologies to study the transcriptome complexity in Caenorhabditis elegans. We generated approximately six million reads using native poly(A)-tailed mRNAs from three developmental stages, with average read lengths ranging from 900 to 1100 nt. Around half of the reads represent full-length transcripts. To utilize the full-length transcripts in defining transcriptome complexity, we devised a method to classify the long reads as the same as existing transcripts or as a novel transcript using sequence mapping tracks rather than existing intron/exon structures, which allowed us to identify roughly 57,000 novel isoforms and recover at least 26,000 out of the 33,500 existing isoforms. The sets of genes with differential expression versus differential isoform usage over development are largely different, implying a fine-tuned regulation at isoform level. We also observed an unexpected increase in putative RNA modification in all bases in the coding region relative to the UTR, suggesting their possible roles in translation. The RNA reads and the method for read classification are expected to deliver new insights into RNA processing and modification and their underlying biology in the future.

Long-read RNA sequencing reveals widespread sex-specific alternative splicing in threespine stickleback fish.

Alice S Naftaly et al.
Genome Research
2021‎

Alternate isoforms are important contributors to phenotypic diversity across eukaryotes. Although short-read RNA-sequencing has increased our understanding of isoform diversity, it is challenging to accurately detect full-length transcripts, preventing the identification of many alternate isoforms. Long-read sequencing technologies have made it possible to sequence full-length alternative transcripts, accurately characterizing alternative splicing events, alternate transcription start and end sites, and differences in UTR regions. Here, we use Pacific Biosciences (PacBio) long-read RNA-sequencing (Iso-Seq) to examine the transcriptomes of five organs in threespine stickleback fish (Gasterosteus aculeatus), a widely used genetic model species. The threespine stickleback fish has a refined genome assembly in which gene annotations are based on short-read RNA sequencing and predictions from coding sequence of other species. This suggests some of the existing annotations may be inaccurate or alternative transcripts may not be fully characterized. Using Iso-Seq we detected thousands of novel isoforms, indicating many isoforms are absent in the current Ensembl gene annotations. In addition, we refined many of the existing annotations within the genome. We noted many improperly positioned transcription start sites that were refined with long-read sequencing. The Iso-Seq-predicted transcription start sites were more accurate and verified through ATAC-seq. We also detected many alternative splicing events between sexes and across organs. We found a substantial number of genes in both somatic and gonadal samples that had sex-specific isoforms. Our study highlights the power of long-read sequencing to study the complexity of transcriptomes, greatly improving genomic resources for the threespine stickleback fish.

RNA components of the spliceosome regulate tissue- and cancer-specific alternative splicing.

Heidi Dvinge et al.
Genome Research
2019‎

Alternative splicing of pre-mRNAs plays a pivotal role during the establishment and maintenance of human cell types. Characterizing the trans-acting regulatory proteins that control alternative splicing has therefore been the focus of much research. Recent work has established that even core protein components of the spliceosome, which are required for splicing to proceed, can nonetheless contribute to splicing regulation by modulating splice site choice. We here show that the RNA components of the spliceosome likewise influence alternative splicing decisions. Although these small nuclear RNAs (snRNAs), termed U1, U2, U4, U5, and U6 snRNA, are present in equal stoichiometry within the spliceosome, we found that their relative levels vary by an order of magnitude during development, across tissues, and across cancer samples. Physiologically relevant perturbation of individual snRNAs drove widespread gene-specific differences in alternative splicing but not transcriptome-wide splicing failure. Genes that were particularly sensitive to variations in snRNA abundance in a breast cancer cell line model were likewise preferentially misspliced within a clinically diverse cohort of invasive breast ductal carcinomas. As aberrant mRNA splicing is prevalent in many cancers, we propose that a full understanding of such dysregulated pre-mRNA processing requires study of snRNAs, as well as protein splicing factors. Together, our data show that the RNA components of the spliceosome are not merely basal factors, as has long been assumed. Instead, these noncoding RNAs constitute a previously uncharacterized layer of regulation of alternative splicing, and contribute to the establishment of global splicing programs in both healthy and malignant cells.

Integrative analysis with ChIP-seq advances the limits of transcript quantification from RNA-seq.

Peng Liu et al.
Genome Research
2016‎

RNA-seq is currently the technology of choice for global measurement of transcript abundances in cells. Despite its successes, isoform-level quantification remains difficult because short RNA-seq reads are often compatible with multiple alternatively spliced isoforms. Existing methods rely heavily on uniquely mapping reads, which are not available for numerous isoforms that lack regions of unique sequence. To improve quantification accuracy in such difficult cases, we developed a novel computational method, prior-enhanced RSEM (pRSEM), which uses a complementary data type in addition to RNA-seq data. We found that ChIP-seq data of RNA polymerase II and histone modifications were particularly informative in this approach. In qRT-PCR validations, pRSEM was shown to be superior to competing methods in estimating relative isoform abundances within or across conditions. Data-driven simulations suggested that pRSEM has a greatly decreased false-positive rate at the expense of a small increase in false-negative rate. In aggregate, our study demonstrates that pRSEM transforms existing capacity to precisely estimate transcript abundances, especially at the isoform level.

End Sequence Analysis Toolkit (ESAT) expands the extractable information from single-cell RNA-seq data.

Alan Derr et al.
Genome Research
2016‎

RNA-seq protocols that focus on transcript termini are well suited for applications in which template quantity is limiting. Here we show that, when applied to end-sequencing data, analytical methods designed for global RNA-seq produce computational artifacts. To remedy this, we created the End Sequence Analysis Toolkit (ESAT). As a test, we first compared end-sequencing and bulk RNA-seq using RNA from dendritic cells stimulated with lipopolysaccharide (LPS). As predicted by the telescripting model for transcriptional bursts, ESAT detected an LPS-stimulated shift to shorter 3'-isoforms that was not evident by conventional computational methods. Then, droplet-based microfluidics was used to generate 1000 cDNA libraries, each from an individual pancreatic islet cell. ESAT identified nine distinct cell types, three distinct β-cell types, and a complex interplay between hormone secretion and vascularization. ESAT, then, offers a much-needed and generally applicable computational pipeline for either bulk or single-cell RNA end-sequencing.

PRAM: a novel pooling approach for discovering intergenic transcripts from large-scale RNA sequencing experiments.

Peng Liu et al.
Genome Research
2020‎

Publicly available RNA-seq data is routinely used for retrospective analysis to elucidate new biology. Novel transcript discovery enabled by joint analysis of large collections of RNA-seq data sets has emerged as one such analysis. Current methods for transcript discovery rely on a '2-Step' approach where the first step encompasses building transcripts from individual data sets, followed by the second step that merges predicted transcripts across data sets. To increase the power of transcript discovery from large collections of RNA-seq data sets, we developed a novel '1-Step' approach named Pooling RNA-seq and Assembling Models (PRAM) that builds transcript models from pooled RNA-seq data sets. We demonstrate in a computational benchmark that 1-Step outperforms 2-Step approaches in predicting overall transcript structures and individual splice junctions, while performing competitively in detecting exonic nucleotides. Applying PRAM to 30 human ENCODE RNA-seq data sets identified unannotated transcripts with epigenetic and RAMPAGE signatures similar to those of recently annotated transcripts. In a case study, we discovered and experimentally validated new transcripts through the application of PRAM to mouse hematopoietic RNA-seq data sets. We uncovered new transcripts that share a differential expression pattern with a neighboring gene Pik3cg implicated in human hematopoietic phenotypes, and we provided evidence for the conservation of this relationship in human. PRAM is implemented as an R/Bioconductor package.
