This service exclusively searches for literature that cites resources. Please be aware that the total number of searchable documents is limited to those containing RRIDs and does not include all open-access literature.
Circular RNAs (circRNAs) have emerged as an important class of functional RNA molecules. Short-read RNA sequencing (RNA-seq) is a widely used strategy to identify circRNAs. However, an inherent limitation of short-read RNA-seq is that it does not experimentally determine the full-length sequences and exact exonic compositions of circRNAs. Here, we report isoCirc, a strategy for sequencing full-length circRNA isoforms, using rolling circle amplification followed by nanopore long-read sequencing. We describe an integrated computational pipeline to reliably characterize full-length circRNA isoforms using isoCirc data. Using isoCirc, we generate a comprehensive catalog of 107,147 full-length circRNA isoforms across 12 human tissues and one human cell line (HEK293), including 40,628 isoforms ≥500 nt in length. We identify widespread alternative splicing events within the internal part of circRNAs, including 720 retained intron events corresponding to a class of exon-intron circRNAs (EIciRNAs). Collectively, isoCirc and the companion dataset provide a useful strategy and resource for studying circRNAs in human transcriptomes.
RNA-Seq based transcriptome assembly has become a fundamental technique for studying expressed mRNAs (i.e., transcripts or isoforms) in a cell using high-throughput sequencing technologies, and is serving as a basis to analyze the structural and quantitative differences of expressed isoforms between samples. However, the current transcriptome assembly algorithms are not specifically designed to handle large amounts of errors that are inherent in real RNA-Seq datasets, especially those involving multiple samples, making downstream differential analysis applications difficult. On the other hand, multiple sample RNA-Seq datasets may provide more information than single sample datasets that can be utilized to improve the performance of transcriptome assembly and abundance estimation, but such information remains overlooked by the existing assembly tools.
RT-PCR and northern blots have long been used to study RNA isoform usage for single genes. Recent advancements in long-read sequencing have yielded unprecedented information about the usage and abundance of these RNA isoforms. However, visualization of long-read sequencing data remains challenging due to the high information density. To alleviate these issues, we have developed NanoBlot, an open-source R-package that generates northern blot and RT-PCR-like images from long-read sequencing data. NanoBlot requires aligned, positionally sorted and indexed BAM files. Plotting is based around ggplot2 and is easily customizable. Advantages of NanoBlot include a robust system for designing probes to visualize isoforms including excluding reads based on the presence or absence of a specified region, an elegant solution to representing isoforms with continuous variations in length, and the ability to overlay multiple genes in the same plot using different colors. We present examples of nanoblots compared to actual northern blot data. In addition to traditional gel-like images, the NanoBlot package can also output other visualizations such as violin plots and 3'-RACE-like plots focused on 3'-end isoform visualization. The use of the NanoBlot package should provide a simple answer to some of the challenges of visualizing long-read RNA-sequencing data.
Transcriptional terminators signal where transcribing RNA polymerases (RNAPs) should halt and disassociate from DNA. However, because termination is stochastic, two different forms of transcript could be produced: one ending at the terminator and the other reading through. An ability to control the abundance of these transcript isoforms would offer bioengineers a mechanism to regulate multi-gene constructs at the level of transcription. Here, we explore this possibility by repurposing terminators as 'transcriptional valves' that can tune the proportion of RNAP read-through. Using one-pot combinatorial DNA assembly, we iteratively construct 1780 transcriptional valves for T7 RNAP and show how nanopore-based direct RNA sequencing (dRNA-seq) can be used to characterize entire libraries of valves simultaneously at a nucleotide resolution in vitro and unravel genetic design principles to tune and insulate termination. Finally, we engineer valves for multiplexed regulation of CRISPR guide RNAs. This work provides new avenues for controlling transcription and demonstrates the benefits of long-read sequencing for exploring complex sequence-function landscapes.
The unicellular yeast Schizosaccharomyces pombe (fission yeast) retains many of the splicing features observed in humans and is thus an excellent model to study the basic mechanisms of splicing. Nearly half the genes contain introns, but the impact of alternative splicing in gene regulation and proteome diversification remains largely unexplored. Here we leverage Oxford Nanopore Technologies native RNA sequencing (dRNA), as well as ribosome profiling data, to uncover the full range of polyadenylated transcripts and translated open reading frames. We identify 332 alternative isoforms affecting the coding sequences of 262 different genes, 97 of which occur at frequencies higher than 20%, indicating that functional alternative splicing in S. pombe is more prevalent than previously suspected. Intron retention events make up about 80% of the cases; these events may be involved in the regulation of gene expression and, in some cases, generate novel protein isoforms, as supported by ribosome profiling data in 18 of the intron retention isoforms. One example is the rpl22 gene, in which intron retention is associated with the translation of a protein of only 13 amino acids. We also find that lowly expressed transcripts tend to have longer poly(A) tails than highly expressed transcripts, highlighting an interdependence between poly(A) tail length and transcript expression level. Finally, we discover 214 novel transcripts that are not annotated, including 158 antisense transcripts, some of which also show translation evidence. The methodologies described in this work open new opportunities to study the regulation of splicing in a simple eukaryotic model.
Long noncoding RNAs (lncRNAs) undergo splicing and have multiple transcribed isoforms. Nevertheless, for lncRNAs, as well as for mRNA, measurements of expression are routinely performed only at the gene level. Metformin is the first-line oral therapy for type 2 diabetes mellitus and other metabolic diseases. However, its mechanism of action is still not fully understood. Transcriptomic analyses of metformin treatment in different cell types have so far considered only protein-coding genes. We aimed to characterize lncRNA isoforms that were differentially affected by metformin treatment in multiple human cell types (three cancer, two non-cancer) and to provide insights into lncRNA regulation by this drug. We selected six series to perform a differential expression (DE) isoform analysis. We also inferred the biological roles of lncRNA DE isoforms using in silico tools. We found the same isoform of an lncRNA (AC016831.6-205) highly expressed in all six metformin series, which has a second exon putatively coding for a peptide with relevance to the drug action. Moreover, two other lncRNA isoforms (ZBED5-AS1-207 and AC125807.2-201) may also behave as cis-regulatory elements for the expression of transcripts in their vicinity. Our results strongly reinforce the importance of considering DE isoforms of lncRNA for understanding metformin mechanisms at the molecular level.
The use of alternative promoters, splicing, and cleavage and polyadenylation (APA) generates mRNA isoforms that expand the diversity and complexity of the transcriptome. Here, we uncovered thousands of previously undescribed 5' uncapped and polyadenylated transcripts (5' UPTs). We show that these transcripts resist exonucleases due to a highly structured RNA and N6-methyladenosine modification at their 5' termini. 5' UPTs appear downstream of APA sites within their host genes and are induced upon APA activation. Strong enrichment in polysomal RNA fractions indicates 5' UPT translational potential. Indeed, APA promotes downstream translation initiation, non-canonical protein output, and consistent changes to peptide presentation at the cell surface. Lastly, we demonstrate the biological importance of 5' UPTs using Bcl2, a prominent anti-apoptotic gene whose entire coding sequence is a 5' UPT generated from 5' UTR-embedded APA sites. Thus, APA is not only accountable for terminating transcripts, but also for generating downstream uncapped RNAs with translation potential and biological impact.
The development of techniques for sequencing messenger RNA (RNA-Seq) makes it possible to study biological mechanisms such as alternative splicing and gene expression regulation more deeply and accurately. Most existing methods employ RNA-Seq to quantify the expression levels of already annotated isoforms from the reference genome. However, the current reference genome is very incomplete due to the complexity of the transcriptome, which hinders the comprehensive investigation of the transcriptome using RNA-Seq. Novel work on isoform inference and estimation purely from RNA-Seq, without annotation information, is therefore desirable.
Horn cancer (HC) is a squamous cell carcinoma of the horn, commonly observed in Bos indicus in Asian countries. To elucidate the complexity of alternative splicing present in HC, high-throughput sequencing and analysis of HC and matching horn normal (HN) tissue were carried out. A total of 535,067 and 849,077 reads were analysed after stringent quality filtering for HN and HC, respectively. The Cufflinks pipeline for transcriptome analysis revealed 4786 novel splice isoforms comprising 2432 exclusively in HC, 2055 exclusively in HN and 298 in both conditions. Based on pathway clustering and in silico verification, 102 novel splice isoforms were selected and further analysed with respect to change in protein sequence using Blastp. Finally, fourteen novel splicing events supported by both Cufflinks and the UCSC genome browser were selected, and their expression was confirmed by RT-qPCR. Future studies targeted at in-depth characterization of these potential candidate splice isoforms might be helpful in the development of relevant biomarkers for early diagnosis of HC. The results reported in this study refine the available information on the transcriptome repertoire of bovine species and support further research towards such biomarkers.
Efficient control of transcription is essential in all organisms. In bacteria, where DNA replication and transcription occur simultaneously, the replication machinery is at risk of colliding with highly abundant transcription complexes. This can be exacerbated by the fact that transcription complexes pause frequently. When pauses are long-lasting, the stalled complexes must be removed to prevent collisions with either another transcription complex or the replication machinery. HelD is a protein that represents a new class of ATP-dependent motor proteins distantly related to helicases. It was first identified in the model Gram-positive bacterium Bacillus subtilis and is involved in removing and recycling stalled transcription complexes. To date, two classes of HelD have been identified: one in the low G+C and the other in the high G+C Gram-positive bacteria. In this work, we have undertaken the first comprehensive investigation of the phylogenetic diversity of HelD proteins. We show that genes in certain bacterial classes have been inherited by horizontal gene transfer, many organisms contain multiple expressed isoforms of HelD, some of which are associated with antibiotic resistance, and that there is a third class of HelD protein found in Gram-negative bacteria. In summary, HelD proteins represent an important new class of transcription factors associated with genome maintenance and antibiotic resistance that are conserved across the Eubacterial kingdom.
Balanced processing of HIV-1 RNA is critical to virus replication and is regulated by host factors. In this report, we demonstrate that overexpression of either Tra2α or Tra2β results in a marked reduction in HIV-1 Gag/Env expression, an effect associated with changes in HIV-1 RNA accumulation, altered viral splice site usage, and a block to export of HIV-1 genomic RNA. A natural isoform of Tra2β (Tra2β3), lacking the N-terminal RS domain, also suppressed HIV-1 expression but had different effects on viral RNA processing. The functional differences between the Tra2β isoforms were also observed in the context of another RNA substrate indicating that these factors have distinct functions within the cell. Finally, we demonstrate that Tra2β depletion results in a selective reduction in HIV-1 Env expression as well as an increase in multiply spliced viral RNA. Together, the findings indicate that Tra2α/β can play important roles in regulating HIV-1 RNA metabolism and expression.
Alternative splicing, polyadenylation of pre-messenger RNA molecules and differential promoter usage can produce a variety of transcript isoforms whose respective expression levels are regulated in time and space, thus contributing specific biological functions. However, the repertoire of mammalian alternative transcripts and their regulation are still poorly understood. Second-generation sequencing is now opening unprecedented routes to address the analysis of entire transcriptomes. Here, we developed methods that allow the prediction and quantification of alternative isoforms derived solely from exon expression levels in RNA-Seq data. These are based on an explicit statistical model and enable the prediction of alternative isoforms within or between conditions using any known gene annotation, as well as the relative quantification of known transcript structures. Applying these methods to a human RNA-Seq dataset, we validated a significant fraction of the predictions by RT-PCR. Data further showed that these predictions correlated well with information originating from junction reads. A direct comparison with exon arrays indicated improved performances of RNA-Seq over microarrays in the prediction of skipped exons. Altogether, the set of methods presented here comprehensively addresses multiple aspects of alternative isoform analysis. The software is available as an open-source R-package called Solas at http://cmb.molgen.mpg.de/2ndGenerationSequencing/Solas/.
Eukaryotic genes often generate a variety of RNA isoforms that can lead to functionally distinct protein variants. The synthesis and stability of RNA isoforms is poorly characterized because current methods to quantify RNA metabolism use short-read sequencing and cannot detect RNA isoforms. Here we present nanopore sequencing-based isoform dynamics (nano-ID), a method that detects newly synthesized RNA isoforms and monitors isoform metabolism. Nano-ID combines metabolic RNA labeling, long-read nanopore sequencing of native RNA molecules, and machine learning. Nano-ID derives RNA stability estimates and evaluates stability determining factors such as RNA sequence, poly(A)-tail length, secondary structure, translation efficiency, and RNA-binding proteins. Application of nano-ID to the heat shock response in human cells reveals that many RNA isoforms change their stability. Nano-ID also shows that the metabolism of individual RNA isoforms differs strongly from that estimated for the combined RNA signal at a specific gene locus. Nano-ID enables studies of RNA metabolism at the level of single RNA molecules and isoforms in different cell states and conditions.
RNA-sequencing and tailored bioinformatic methodologies have paved the way for identification of expressed fusion genes from the chaotic genomes of solid tumors. We have recently successfully exploited RNA-sequencing for the discovery of 24 novel fusion genes in breast cancer. Here, we demonstrate the importance of continuous optimization of the bioinformatic methodology for this purpose, and report the discovery and experimental validation of 13 additional fusion genes from the same samples. Integration of copy number profiling with the RNA-sequencing results revealed that the majority of the gene fusions were promoter-donating events that occurred at copy number transition points or involved high-level DNA-amplifications. Sequencing of genomic fusion break points confirmed that DNA-level rearrangements underlie selected fusion transcripts. Furthermore, a significant portion (>60%) of the fusion genes were alternatively spliced. This illustrates the importance of reanalyzing sequencing data as gene definitions change and bioinformatic methods improve, and highlights the previously unforeseen isoform diversity among fusion transcripts.
Alternative splicing generates differing RNA isoforms that govern phenotypic complexity of eukaryotes. Its malfunction underlies many diseases, including cancer and cardiovascular diseases. Comparative analysis of RNA isoforms at the genome-wide scale has been difficult. Here, we establish an experimental and computational pipeline that performs de novo transcript annotation and accurately quantifies transcript isoforms from cDNA sequences with a full-length isoform detection accuracy of 97.6%. We generate a searchable, quantitative human transcriptome annotation with 31,025 known and 5,740 novel transcript isoforms (http://steinmetzlab.embl.de/iBrowser/). By analyzing the isoforms in the presence of RNA Binding Motif Protein 20 (RBM20) mutations associated with aggressive dilated cardiomyopathy (DCM), we identify 121 differentially expressed transcript isoforms in 107 cardiac genes. Our approach enables quantitative dissection of complex transcript architecture instead of mere identification of inclusion or exclusion of individual exons, as exemplified by the discovery of IMMT isoforms mis-spliced by RBM20 mutations. Thereby we achieve a path to direct differential expression testing independent of an existing annotation of transcript isoforms, providing more immediate biological interpretation and higher resolution transcriptome comparisons.
The human cytomegalovirus (HCMV) is a ubiquitous, human pathogenic herpesvirus. The complete viral genome is transcriptionally active during infection; however, a large part of its transcriptome has yet to be annotated. In this work, we applied the amplified isoform sequencing technique from Pacific Biosciences to characterize the lytic transcriptome of HCMV strain Towne varS. We developed a pipeline for transcript annotation using long-read sequencing data. We identified 248 transcriptional start sites, 116 transcriptional termination sites and 80 splicing events. Using this information, we have annotated 291 previously undescribed or only partially annotated transcript isoforms, including eight novel antisense transcripts and their isoforms, as well as a novel transcript (RS2) in the short repeat region, partially antisense to RS1. Similarly to other organisms, we discovered a high transcriptional diversity in HCMV, with many transcripts only slightly differing from one another. Comparing our transcriptome profiling results to an earlier ribosome footprint analysis, we have concluded that the majority of the transcripts contain multiple translationally active ORFs, and also that most isoforms contain unique combinations of ORFs. Based on these results, we propose that one important function of this transcriptional diversity may be to provide a regulatory mechanism at the level of translation.
Direct RNA sequencing holds great promise for the de novo identification of RNA modifications at single-coordinate resolution; however, interpretation of raw sequencing output to discover modified bases remains a challenge. Using Oxford Nanopore's direct RNA sequencing technology, we developed a random forest classifier trained using experimentally detected N6-methyladenosine (m6A) sites within DRACH motifs. Our software MINES (m6A Identification using Nanopore Sequencing) assigned m6A methylation status to more than 13,000 previously unannotated DRACH sites in endogenous HEK293T transcripts and identified more than 40,000 sites with isoform-level resolution in a human mammary epithelial cell line. These sites displayed sensitivity to the m6A writer, METTL3, and eraser, ALKBH5, respectively. MINES (https://github.com/YeoLab/MINES.git) enables m6A annotation at single coordinate-level resolution from direct RNA nanopore sequencing.
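As a small illustration of the DRACH motif mentioned in the abstract above, the following sketch scans a sequence for candidate sites. It is not the MINES implementation and involves no classifier; it only expands the IUPAC codes (D = A/G/T, R = A/G, H = A/C/T) into a regular expression. The function name `find_drach_sites` and the example sequence are hypothetical.

```python
import re

# DRACH expanded from IUPAC codes: D = A/G/T, R = A/G, A, C, H = A/C/T.
# The lookahead lets overlapping candidate sites be reported as well.
DRACH = re.compile(r"(?=([AGT][AG]AC[ACT]))")

def find_drach_sites(seq):
    """Return (0-based index of the central A, 5-mer) for each DRACH match.

    Hypothetical helper for illustration; RNA input (U) is treated as T.
    """
    seq = seq.upper().replace("U", "T")
    return [(m.start() + 2, m.group(1)) for m in DRACH.finditer(seq)]

if __name__ == "__main__":
    for pos, motif in find_drach_sites("TTGGACTTT"):
        print(pos, motif)
```

Each reported position is only a candidate adenosine; whether it is actually methylated is the question a trained model such as the one described in the abstract would have to answer.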
Facioscapulohumeral muscular dystrophy (FSHD) is an inherited muscle disease caused by misexpression of the DUX4 gene in skeletal muscle. DUX4 is a transcription factor, which is normally expressed in the cleavage-stage embryo and regulates gene expression involved in early embryonic development. Recent studies revealed that DUX4 also activates the transcription of repetitive elements such as endogenous retroviruses (ERVs), mammalian apparent long terminal repeat (LTR)-retrotransposons and pericentromeric satellite repeats (Human Satellite II). DUX4-bound ERV sequences also create alternative promoters for genes or long non-coding RNAs, producing fusion transcripts. To further understand transcriptional regulation by DUX4, we performed nanopore long-read direct RNA sequencing (dRNA-seq) of human muscle cells induced by DUX4, because long reads show whole isoforms with greater confidence. We successfully detected differential expression of known DUX4-induced genes and discovered 61 differentially expressed repeat loci, which are near DUX4-ChIP peaks. We also identified 247 gene-ERV fusion transcripts, of which 216 were not reported previously. In addition, long-read dRNA-seq clearly shows that RNA splicing is a common event in DUX4-activated ERV transcripts. Long-read analysis showed non-LTR transposons including Alu elements are also transcribed from LTRs. Our findings revealed further complexity of DUX4-induced ERV transcripts. This catalogue of DUX4-activated repetitive elements may provide useful information to elucidate the pathology of FSHD. Also, our results indicate that nanopore dRNA-seq has complementary strengths to conventional short-read complementary DNA sequencing.
Biological and biomedical research relies on comprehensive understanding of protein-coding transcripts. However, the total number of human proteins is still unknown due to the prevalence of alternative splicing. In this paper, we detected 31,566 novel transcripts with coding potential by filtering our ab initio predictions with 50 RNA-seq datasets from diverse tissues/cell lines. PCR followed by MiSeq sequencing showed that at least 84.1% of these predicted novel splice sites could be validated. In contrast to known transcripts, the expression of these novel transcripts was highly tissue-specific. Based on these novel transcripts, at least 36 novel proteins were detected from shotgun proteomics data of 41 breast samples. We also showed that L1 retrotransposons have a more significant impact on the origin of new transcripts/genes than previously thought. Furthermore, we found that alternative splicing is extraordinarily widespread for genes involved in specific biological functions like protein binding, nucleoside binding, neuron projection, membrane organization and cell adhesion. In the end, the total number of human transcripts with protein-coding potential was estimated to be at least 204,950.