This service exclusively searches for literature that cites resources. Please be aware that the total number of searchable documents is limited to those containing RRIDs and does not include all open-access literature.
DNA methylation plays critical roles in transcriptional regulation and chromatin remodeling. Differentially methylated regions (DMRs) have important implications for development, aging and diseases. Therefore, genome-wide mapping of DMRs across various temporal and spatial methylomes is important in revealing the impact of epigenetic modifications on heritable phenotypic variation. We present a quantitative approach, quantitative differentially methylated regions (QDMR), to quantify methylation difference and identify DMRs from genome-wide methylation profiles by adapting Shannon entropy. QDMR was applied to synthetic methylation patterns and to methylation profiles detected by methylated DNA immunoprecipitation microarray (MeDIP-chip) in human tissues/cells. This approach can give a reasonable quantitative measure of methylation difference across multiple samples. The DMR threshold was then determined from a methylation probability model. Using this threshold, QDMR identified 10,651 tissue DMRs that are related to genes enriched for cell differentiation, including 4740 DMRs not identified by the method developed by Rakyan et al. QDMR can also measure the sample specificity of each DMR. Finally, the application to methylation profiles detected by reduced representation bisulphite sequencing (RRBS) in mouse showed the platform-free and species-free nature of QDMR. This approach provides an effective tool for the high-throughput identification of potential functional regions involved in epigenetic regulation.
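To illustrate the general idea of using Shannon entropy to score methylation difference across samples, here is a minimal Python sketch. It is not QDMR's actual formula, which the abstract does not give; the function name `methylation_entropy`, the normalisation of levels to a probability vector, and the example values are all assumptions made for illustration.

```python
import math

def methylation_entropy(levels):
    """Shannon entropy (in bits) of per-sample methylation levels.

    Hypothetical helper: levels are normalised to sum to 1 and treated as a
    probability distribution over samples. Low entropy means the methylation
    is concentrated in few samples (a candidate DMR); entropy close to
    log2(number of samples) means the region is methylated uniformly.
    """
    if not levels:
        raise ValueError("need at least one sample")
    total = sum(levels)
    if total == 0:
        # Unmethylated everywhere: no difference between samples.
        return math.log2(len(levels))
    probs = [x / total for x in levels]
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Illustrative values only (not data from the paper).
uniform_region = methylation_entropy([0.8, 0.8, 0.8, 0.8])
specific_region = methylation_entropy([0.9, 0.0, 0.0, 0.0])
```

In this simplified form a region would be flagged as differentially methylated when its entropy falls below some threshold. Plain entropy of normalised levels cannot tell uniformly high from uniformly low methylation, which is one reason a real method needs more than this sketch.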
Synthesis of selenoproteins depends on decoding of the UGA stop codon as the amino acid selenocysteine (Sec). This process requires the presence of a Sec insertion sequence element (SECIS) in the 3'-untranslated region of selenoprotein mRNAs and its interaction with the SECIS binding protein 2 (SBP2). In humans, mutations in the SBP2-encoding gene Sec insertion sequence binding protein 2 (SECISBP2) that alter the amino acid sequence or cause splicing defects lead to abnormal thyroid hormone metabolism. Herein, we present the first in silico and in vivo functional characterization of alternative splicing of SECISBP2. We report a complex splicing pattern in the 5'-region of human SECISBP2, wherein at least eight splice variants encode five isoforms with varying N-terminal sequence. One of the isoforms, mtSBP2, contains a mitochondrial targeting sequence and localizes to mitochondria. Using a minigene-based in vivo splicing assay we characterized the splicing efficiency of several alternative transcripts, and show that the splicing event that creates mtSBP2 can be modulated by antisense oligonucleotides. Moreover, we show that full-length SBP2 and some alternatively spliced variants are subject to a coordinated transcriptional and translational regulation in response to ultraviolet type A irradiation-induced stress. Overall, our data broadens the functional scope of a housekeeping protein essential to selenium metabolism.
5-Hydroxymethylcytosine (5hmC) is present in T-even phage and mammalian DNA as well as some nucleoside antibiotics, including mildiomycin and bacimethrin, during whose synthesis 5hmC is produced by the hydrolysis of 5-hydroxymethyl cytidine 5'-monophosphate (hmCMP) by an N-glycosidase MilB. Recently, the MilB-CMP complex structure revealed its substrate specificity for CMP over dCMP. However, hmCMP rather than CMP is the preferred substrate for MilB, as shown by its KM for CMP being ∼27-fold higher than that for hmCMP. Here, we determined the crystal structures of MilB and its catalytically inactive E103A mutant in complex with hmCMP. In the structure of the complex, Phe22 and Arg23 are positioned in a cage-like active site resembling the binding pocket for the flipped 5-methylcytosine (5mC) in eukaryotic 5mC-binding proteins. Van der Waals interaction between the benzene ring of Phe22 and the pyrimidine ring of hmCMP stabilizes its binding. Remarkably, upon hmCMP binding, the guanidinium group of Arg23 was bent ∼65° toward hmCMP to recognize its 5-hydroxymethyl group, inducing semi-closure of the cage-like pocket. Mutagenesis studies of Arg23 and bioinformatics analysis demonstrate that the positively charged Arg/Lys at this site is critical for the specific recognition of the 5-hydroxymethyl group of hmCMP.
In silico prediction of genomic long non-coding RNAs (lncRNAs) is a prerequisite for the construction and elucidation of non-coding regulatory networks. Chromatin modifications marked by chromatin regulators are important epigenetic features, which can be captured by prevailing high-throughput approaches such as ChIP sequencing. We demonstrate that the accuracy of lncRNA predictions can be greatly improved by incorporating high-throughput chromatin modifications over mouse embryonic stem cell differentiation toward adult cerebellum, using logistic regression with LASSO regularization. The discriminating features include H3K9me3, H3K27ac, H3K4me1, open reading frames and several repeat elements. Importantly, chromatin information is suggested to be complementary to genomic sequence information, highlighting the importance of an integrated model. Applying the integrated model, we obtain a list of putative lncRNAs based on uncharacterized fragments from transcriptome assembly. We demonstrate, by expression and Gene Ontology enrichment analysis, that the putative lncRNAs have regulatory roles in the vicinity of known gene loci. We also show that lncRNA expression specificity can be efficiently modeled by chromatin data from the same developmental stage. The study not only supports the biological hypothesis that chromatin can regulate expression of tissue-specific or developmental stage-specific lncRNAs but also reveals the discriminating features between lncRNAs and coding genes, which would guide further lncRNA identification and characterization.
The transforming growth factor-β (TGF-β) signalling pathway participates in various biological processes. Dysregulation of Smad4, a central cellular transducer of TGF-β signalling, is implicated in a wide range of human diseases and developmental disorders. However, the mechanisms underlying Smad4 dysregulation are not fully understood. Using a functional screening approach based on luciferase reporter assays, we identified 39 microRNAs (miRNAs) as potential regulators of Smad4 from an expression library of 388 human miRNAs. The screening was supported by bioinformatic analysis, as 24 of 39 identified miRNAs were also predicted to target Smad4. MiR-199a, one of the identified miRNAs, was inversely correlated with Smad4 expression in various human cancer cell lines and gastric cancer tissues, and repressed Smad4 expression and blocked canonical TGF-β transcriptional responses in cell lines. These effects were dependent on the presence of a conserved, but not perfect seed paired, miR-199a-binding site in the Smad4 3'-untranslated region (UTR). Overexpression of miR-199a significantly inhibited the ability of TGF-β to induce gastric cancer cell growth arrest and apoptosis in vitro, and promoted anchorage-independent growth in soft agar, suggesting that miR-199a plays an oncogenic role in human gastric tumourigenesis. In conclusion, our functional screening uncovers multiple miRNAs that regulate the cellular responsiveness to TGF-β signalling and reveals important roles of miR-199a in gastric cancer by directly targeting Smad4.
GeneHub-GEPIS is a web application that performs digital expression analysis in human and mouse tissues based on an integrated gene database. Using aggregated expressed sequence tag (EST) library information and EST counts, the application calculates the normalized gene expression levels across a large panel of normal and tumor tissues, thus providing rapid expression profiling for a given gene. The backend GeneHub component of the application contains pre-defined gene structures derived from mRNA transcript sequences from major databases and includes extensive cross references for commonly used gene identifiers. ESTs are then linked to genes based on their precise genomic locations as determined by GMAP. This genome-based approach reduces incorrect matches between ESTs and genes, thus minimizing the noise seen with previous tools. In addition, the gene-centric design makes it possible to add several important features, including text searching capabilities, the ability to accept diverse input values, expression analysis for microRNAs, basic gene annotation, batch analysis and linking between mouse and human genes. GeneHub-GEPIS is available at http://www.cgl.ucsf.edu/Research/genentech/genehub-gepis/ or http://www.gepis.org/.
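The abstract does not state the exact normalization GeneHub-GEPIS applies. As a rough sketch of how EST counts can be turned into comparable expression levels, the Python example below scales a gene's EST count in each tissue by the total number of ESTs sequenced from that tissue. The function name `normalized_expression`, the per-million scale and the example counts are assumptions, not the tool's implementation.

```python
def normalized_expression(gene_est_counts, library_sizes, scale=1_000_000):
    """Return ESTs-per-`scale` for one gene in every tissue.

    gene_est_counts: {tissue: ESTs mapped to the gene in that tissue}
    library_sizes:   {tissue: total ESTs pooled from that tissue's libraries}
    Hypothetical helper; tissues with no sequenced ESTs are skipped.
    """
    result = {}
    for tissue, total in library_sizes.items():
        if total <= 0:
            continue  # no library data, so no meaningful level
        result[tissue] = gene_est_counts.get(tissue, 0) * scale / total
    return result

# Illustrative numbers only.
profile = normalized_expression(
    {"liver": 50, "brain": 5},
    {"liver": 200_000, "brain": 100_000, "lung": 50_000},
)
```

Dividing by library size keeps a deeply sequenced tissue from looking like a highly expressing one, which is the point of normalizing before comparing normal and tumor panels.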
MicroRNAs (miRNAs) have recently been proposed as a versatile class of molecules involved in regulation of a variety of biological processes. However, the role of miRNAs in TGF-beta-regulated biological processes is poorly addressed. In this study, we found that miR-24 was upregulated during myoblast differentiation and could be inhibited by TGF-beta1. Using both a reporter assay and Northern blot analysis, we showed that TGF-beta1 repressed miR-24 transcription which was dependent on the presence of Smad3 and a Smads binding site in the promoter region of miR-24. TGF-beta1 was unable to inhibit miR-24 expression in Smad3-deficient myoblasts, which exhibited accelerated myogenesis. Knockdown of miR-24 led to reduced expression of myogenic differentiation markers in C2C12 cells, while ectopic expression of miR-24 enhanced differentiation, and partially rescued inhibited myogenesis by TGF-beta1. This is the first study demonstrating a critical role for miRNAs in modulating TGF-beta-dependent inhibition of myogenesis, and provides a novel mechanism of the genetic regulation of TGF-beta signaling during skeletal muscle differentiation.
Circadian rhythm exerts its influence on animal physiology and behavior by regulating gene expression at various levels. Here we systematically explored circadian long non-coding RNAs (lncRNAs) in mouse liver and examined their circadian regulation. We found that a significant proportion of circadian lncRNAs are expressed at enhancer regions, mostly bound by two key circadian transcription factors, BMAL1 and REV-ERBα. These circadian lncRNAs showed circadian phases similar to those of their nearby genes. The extent of their nuclear localization is higher than that of protein-coding genes but lower than that of enhancer RNAs. The association between enhancers and circadian lncRNAs is also observed in tissues other than liver. Comparative analysis between mouse and rat circadian liver transcriptomes showed that circadian transcription at lncRNA loci tends to be conserved despite the low sequence conservation of lncRNAs. One such circadian lncRNA, termed lnc-Crot, led us to identify a super-enhancer region interacting with a cluster of genes involved in circadian regulation of metabolism through long-range interactions. Further experiments showed that the lnc-Crot locus has enhancer function independent of lnc-Crot's transcription. Our results suggest that the enhancer-associated circadian lncRNAs mark the genomic loci modulating long-range circadian gene regulation and shed new light on the evolutionary origin of lncRNAs.
Epigenetic alterations, including 5-methylcytosine (5mC), 5-hydroxymethylcytosine (5hmC) and nucleosome positioning (NP), in cell-free DNA (cfDNA) have been widely observed in human diseases, and many available cfDNA-based epigenome-wide profiles exhibit high sensitivity and specificity in disease detection and classification. However, due to the lack of efficient collection, standardized quality control, and analysis procedures, efficiently integrating and reusing these data remain considerable challenges. Here, we introduce CFEA (http://www.bio-data.cn/CFEA), a cell-free epigenome database dedicated to three types of widely adopted epigenetic modifications (5mC, 5hmC and NP) involved in 27 human diseases. We developed bioinformatic pipelines for quality control and standard data processing and an easy-to-use web interface to facilitate the query, visualization and download of these cell-free epigenome data. We also manually curated related biological and clinical information for each profile, allowing users to better browse and compare cfDNA epigenomes at a specific stage (such as early- or metastasis-stage) of cancer development. CFEA provides a comprehensive and timely resource to the scientific community and supports the development of liquid biopsy-based biomarkers for various human diseases.
The GENCODE project annotates human and mouse genes and transcripts supported by experimental data with high accuracy, providing a foundational resource that supports genome biology and clinical genomics. GENCODE annotation processes make use of primary data and bioinformatic tools and analysis generated both within the consortium and externally to support the creation of transcript structures and the determination of their function. Here, we present improvements to our annotation infrastructure, bioinformatics tools, and analysis, and the advances they support in the annotation of the human and mouse genomes including: the completion of first pass manual annotation for the mouse reference genome; targeted improvements to the annotation of genes associated with SARS-CoV-2 infection; collaborative projects to achieve convergence across reference annotation databases for the annotation of human and mouse protein-coding genes; and the first GENCODE manually supervised automated annotation of lncRNAs. Our annotation is accessible via Ensembl, the UCSC Genome Browser and https://www.gencodegenes.org.
Many open-access transcriptomic datasets for coronavirus disease 2019 (COVID-19) have been generated, but they are highly heterogeneous and difficult to analyze. To use these invaluable data for a better understanding of COVID-19, additional software should be developed. For researchers without bioinformatic skills in particular, a user-friendly platform is mandatory. We developed the COVID19db platform (http://hpcc.siat.ac.cn/covid19db & http://www.biomedical-web.com/covid19db) that provides 39 930 drug-target-pathway interactions and 95 COVID-19 related datasets, which include transcriptomes of 4127 human samples across 13 body sites associated with the exposure of 33 microbes and 33 drugs/agents. To facilitate data application, each dataset was standardized and annotated with rich clinical information. The platform further provides 14 different analytical applications to analyze various mechanisms underlying COVID-19. Moreover, the 14 applications enable researchers to customize grouping and settings for different analyses and allow them to perform analyses using their own data. Furthermore, a Drug Discovery tool is designed to identify potential drugs and targets at whole transcriptomic scale. As a proof of concept, we used COVID19db to identify multiple potential drugs and targets for COVID-19. In summary, COVID19db provides user-friendly web interfaces to freely analyze and download data and to submit new data for further integration, and it can accelerate the identification of effective strategies against COVID-19.
Heterochromatin plays essential roles in eukaryotic genomes, such as regulating genes, maintaining genome integrity and silencing repetitive DNA elements. Identifying genome-wide heterochromatin regions is crucial for studying transcriptional regulation. We propose the Human Heterochromatin Chromatin Database (HHCDB) for archiving heterochromatin regions defined by specific or combined histone modifications (H3K27me3, H3K9me2, H3K9me3) according to a unified pipeline. 42 839 743 heterochromatin regions were identified from 578 samples derived from 241 cell-types/cell lines and 92 tissue types. Genomic information is provided in HHCDB, including chromatin location, gene structure, transcripts, distance from transcription start site, neighboring genes, CpG islands, transposable elements, 3D genomic structure and functional annotations. Furthermore, transcriptome data from 73 single cells were analyzed and integrated to explore cell type-specific heterochromatin-related genes. HHCDB affords rich visualization through the UCSC Genome Browser and our self-developed tools. We have also developed a specialized online analysis platform to mine differential heterochromatin regions in cancers. We performed several analyses to explore the function of cancer-specific heterochromatin-related genes, including clinical feature analysis, immune cell infiltration analysis and the construction of drug-target networks. HHCDB is a valuable resource for studying epigenetic regulation, 3D genomics and heterochromatin regulation in development and disease. HHCDB is freely accessible at http://hhcdb.edbc.org/.
The Pathosystems Resource Integration Center (PATRIC) is the all-bacterial Bioinformatics Resource Center (BRC) (http://www.patricbrc.org). A joint effort by two of the original National Institute of Allergy and Infectious Diseases-funded BRCs, PATRIC provides researchers with an online resource that stores and integrates a variety of data types [e.g. genomics, transcriptomics, protein-protein interactions (PPIs), three-dimensional protein structures and sequence typing data] and associated metadata. Datatypes are summarized for individual genomes and across taxonomic levels. All genomes in PATRIC, currently more than 10,000, are consistently annotated using RAST, the Rapid Annotations using Subsystems Technology. Summaries of different data types are also provided for individual genes, where comparisons of different annotations are available, and also include available transcriptomic data. PATRIC provides a variety of ways for researchers to find data of interest and a private workspace where they can store both genomic and gene associations, and their own private data. Both private and public data can be analyzed together using a suite of tools to perform comparative genomic or transcriptomic analysis. PATRIC also includes integrated information related to disease and PPIs. All the data and integrated analysis and visualization tools are freely available. This manuscript describes updates to the PATRIC since its initial report in the 2007 NAR Database Issue.
High-throughput bisulfite sequencing is widely used to measure cytosine methylation at single-base resolution in eukaryotes. It permits systems-level analysis of genomic methylation patterns associated with gene expression and chromatin structure. However, methods for large-scale identification of methylation patterns from bisulfite sequencing are lacking. We developed a comprehensive tool, CpG_MPs, for identification and analysis of the methylation patterns of genomic regions from bisulfite sequencing data. CpG_MPs first normalizes bisulfite sequencing reads into methylation levels of CpGs. It then identifies unmethylated and methylated regions from the methylation status of neighboring CpGs using a hotspot extension algorithm, without knowledge of pre-defined regions. Furthermore, the conservatively and differentially methylated regions across paired or multiple samples (cells or tissues) are identified by combining a combinatorial algorithm with Shannon entropy. CpG_MPs identified a large number of genomic regions with different methylation patterns across five human bisulfite sequencing datasets during cellular differentiation. Different sequence features and significantly cell-specific methylation patterns were observed. These potentially functional regions form candidate regions for functional analysis of DNA methylation during cellular differentiation. CpG_MPs is the first user-friendly tool for identifying methylation patterns of genomic regions from bisulfite sequencing data, permitting further investigation of the biological functions of genome-scale methylation patterns.
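The abstract does not spell out the hotspot extension algorithm, so the Python sketch below only illustrates the broader idea of calling methylated and unmethylated regions from neighboring CpGs without pre-defined region boundaries. The function name `call_regions`, the thresholds (`low`, `high`), the maximum gap and the minimum CpG count are all assumed values, not CpG_MPs parameters.

```python
def call_regions(cpgs, low=0.2, high=0.8, max_gap=200, min_cpgs=3):
    """Group adjacent CpGs that share a methylation status into regions.

    cpgs: iterable of (position, methylation_level) on one chromosome.
    Returns a list of (start, end, status, n_cpgs) tuples.
    Simplified, hypothetical stand-in for a region-calling step.
    """
    def status(level):
        if level <= low:
            return "unmethylated"
        if level >= high:
            return "methylated"
        return None  # intermediate CpGs break a region

    regions = []
    current = None  # mutable [start, end, status, n_cpgs]
    for pos, level in sorted(cpgs):
        s = status(level)
        # Extend the open region if status matches and the CpG is close enough.
        if current and s == current[2] and pos - current[1] <= max_gap:
            current[1] = pos
            current[3] += 1
            continue
        # Otherwise close the open region, keeping it only if large enough.
        if current and current[3] >= min_cpgs:
            regions.append(tuple(current))
        current = [pos, pos, s, 1] if s else None
    if current and current[3] >= min_cpgs:
        regions.append(tuple(current))
    return regions

# Illustrative CpG positions and levels only.
example = [(100, 0.05), (150, 0.1), (220, 0.0), (400, 0.5),
           (500, 0.9), (560, 0.95), (600, 0.85), (1500, 0.9)]
regions = call_regions(example)
```

Here the intermediate CpG at position 400 ends the unmethylated run, and the CpG at 1500 is too far from its neighbors to join the methylated region.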
Various cancer genome projects are underway to identify novel mutations that drive tumorigenesis. While these screens will generate large data sets, the majority of identified missense changes are likely to be innocuous passenger mutations or polymorphisms. As a result, it has become increasingly important to develop computational methods for distinguishing functionally relevant mutations from other variations. We previously developed an algorithm, and now present the web application, CanPredict (http://www.canpredict.org/ or http://www.cgl.ucsf.edu/Research/genentech/canpredict/), to allow users to determine if particular changes are likely to be cancer-associated. The impact of each change is measured using two known methods: Sorting Intolerant From Tolerant (SIFT) and the Pfam-based LogR.E-value metric. A third method, the Gene Ontology Similarity Score (GOSS), provides an indication of how closely the gene in which the variant resides resembles other known cancer-causing genes. Scores from these three algorithms are analyzed by a random forest classifier which then predicts whether a change is likely to be cancer-associated. CanPredict fills an important need in cancer biology and will enable a large audience of biologists to determine which mutations are the most relevant for further study.
The accurate identification and description of the genes in the human and mouse genomes is a fundamental requirement for high quality analysis of data informing both genome biology and clinical genomics. Over the last 15 years, the GENCODE consortium has been producing reference quality gene annotations to provide this foundational resource. The GENCODE consortium includes both experimental and computational biology groups who work together to improve and extend the GENCODE gene annotation. Specifically, we generate primary data, create bioinformatics tools and provide analysis to support the work of expert manual gene annotators and automated gene annotation pipelines. In addition, manual and computational annotation workflows use any and all publicly available data and analysis, along with the research literature to identify and characterise gene loci to the highest standard. GENCODE gene annotations are accessible via the Ensembl and UCSC Genome Browsers, the Ensembl FTP site, Ensembl Biomart, Ensembl Perl and REST APIs as well as https://www.gencodegenes.org.
The detection of nucleic acid sequences in parallel with the discrimination of single nucleotide variations (SNVs) is critical for research and clinical applications. Several limitations make the detection technically challenging, such as the small variation in probe-hybridization energy caused by SNVs, the non-specific amplification of false nucleic acid fragments and the limited choice of dyes imposed by spectral overlap. To circumvent these limitations, we developed a single-molecule nucleic acid detection assay without amplification or fluorescence, termed THREF (hybridization-induced tandem DNA hairpin refolding failure), based on multiplexed magnetic tweezers. THREF can detect DNA and RNA sequences at femtomolar concentrations within 30 min, monitor multiple probes in parallel, quantify the expression level of miR-122 in patient tissues, discriminate SNVs including the hard-to-detect G-U or T-G wobble mutations and reuse the probes to reduce cost. In our demonstrative detections using mock clinical samples, we profiled the let-7 family microRNAs in serum and genotyped SARS-CoV-2 strains in saliva. Overall, the THREF assay can discriminate SNVs with the advantages of high sensitivity, ultra-specificity, multiplexing, reusability, sample hands-free and robustness.
Recently, Type III-A CRISPR-Cas systems were found to catalyze the synthesis of cyclic oligoadenylates (cOAs), a second messenger that specifically activates Csm6, a Cas accessory RNase and confers antiviral defense in bacteria. To test if III-B CRISPR-Cas systems could mediate a similar CRISPR signaling pathway, the Sulfolobus islandicus Cmr-α ribonucleoprotein complex (Cmr-α-RNP) was purified from the native host and tested for cOA synthesis. We found that the system showed a robust production of cyclic tetra-adenylate (c-A4), and that c-A4 functions as a second messenger to activate the III-B-associated RNase Csx1 by binding to its CRISPR-associated Rossmann Fold domain. Investigation of the kinetics of cOA synthesis revealed that Cmr-α-RNP displayed positively cooperative binding to the adenosine triphosphate (ATP) substrate. Furthermore, mutagenesis of conserved domains in Cmr2α confirmed that, while Palm 2 hosts the active site of cOA synthesis, Palm 1 domain serves as the primary site in the enzyme-substrate interaction. Together, our data suggest that the two Palm domains cooperatively interact with ATP molecules to achieve a robust cOA synthesis by the III-B CRISPR-Cas system.
DNA methylation is a key epigenetic mark that is critical for gene regulation in multicellular eukaryotes. Although various human cell types may have the same genome, these cells have different methylomes. The systematic identification and characterization of methylation marks across cell types are crucial to understand the complex regulatory network for cell fate determination. In this study, we proposed an entropy-based framework termed SMART to integrate the whole genome bisulfite sequencing methylomes across 42 human tissues/cells and identified 757 887 genome segments. Nearly 75% of the segments showed uniform methylation across all cell types. From the remaining 25% of the segments, we identified cell type-specific hypo/hypermethylation marks that were specifically hypo/hypermethylated in a minority of cell types using a statistical approach and presented an atlas of the human methylation marks. Further analysis revealed that the cell type-specific hypomethylation marks were enriched through H3K27ac and transcription factor binding sites in cell type-specific manner. In particular, we observed that the cell type-specific hypomethylation marks are associated with the cell type-specific super-enhancers that drive the expression of cell identity genes. This framework provides a complementary, functional annotation of the human genome and helps to elucidate the critical features and functions of cell type-specific hypomethylation.
Many circRNA transcriptome data were deposited in public resources, but these data show great heterogeneity. Researchers without bioinformatics skills have difficulty in investigating these invaluable data or their own data. Here, we specifically designed circMine (http://hpcc.siat.ac.cn/circmine and http://www.biomedical-web.com/circmine/) that provides 1 821 448 entries formed by 136 871 circRNAs, 87 diseases and 120 circRNA transcriptome datasets of 1107 samples across 31 human body sites. circMine further provides 13 online analytical functions to comprehensively investigate these datasets to evaluate the clinical and biological significance of circRNA. To improve the data applicability, each dataset was standardized and annotated with relevant clinical information. All of the 13 analytic functions allow users to group samples based on their clinical data and assign different parameters for different analyses, and enable them to perform these analyses using their own circRNA transcriptomes. Moreover, three additional tools were developed in circMine to systematically discover the circRNA-miRNA interaction and circRNA translatability. For example, we systematically discovered five potential translatable circRNAs associated with prostate cancer progression using circMine. In summary, circMine provides user-friendly web interfaces to browse, search, analyze and download data freely, and submit new data for further integration, and it can be an important resource to discover significant circRNA in different diseases.