The Encyclopedia of DNA Elements (ENCODE) project has established a genomic resource for mammalian development, profiling a diverse panel of mouse tissues at 8 developmental stages from 10.5 days after conception until birth, including transcriptomes, methylomes and chromatin states. Here we systematically examined the state and accessibility of chromatin in the developing mouse fetus. In total we performed 1,128 chromatin immunoprecipitation with sequencing (ChIP-seq) assays for histone modifications and 132 assay for transposase-accessible chromatin using sequencing (ATAC-seq) assays for chromatin accessibility across 72 distinct tissue-stages. We used integrative analysis to develop a unified set of chromatin state annotations, infer the identities of dynamic enhancers and key transcriptional regulators, and characterize the relationship between chromatin state and accessibility during developmental gene regulation. We also leveraged these data to link enhancers to putative target genes and demonstrate tissue-specific enrichments of sequence variants associated with disease in humans. The mouse ENCODE data sets provide a compendium of resources for biomedical researchers and achieve, to our knowledge, the most comprehensive view of chromatin dynamics during mammalian fetal development to date.
PubMed ID: 32728240
Open source whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner. Used for analysis of genotype/phenotype data. Through integration with gPLINK and Haploview, there is some support for subsequent visualization, annotation and storage of results. PLINK 1.9 is the improved second generation of the software.
Software environment and programming language for statistical computing and graphics. R is an integrated suite of software facilities for data manipulation, calculation and graphical display. Can be extended via packages. Some packages are supplied with the R distribution and more are available through the CRAN family. It compiles and runs on a wide variety of UNIX platforms, Windows and MacOS.
The original SAMTOOLS package has been split into three separate repositories: Samtools, BCFtools and HTSlib. Samtools manipulates next-generation sequencing data and is used for reading, writing, editing, indexing and viewing nucleotide alignments in SAM, BAM and CRAM formats. BCFtools is used for reading and writing BCF2, VCF and gVCF files and for calling, filtering and summarising SNP and short indel sequence variants. HTSlib is used for reading and writing high-throughput sequencing data.
Web tool to search, sort, analyze, visualize and download data of interest. Along with providing details of the ontologies, gene products and annotations, it features a BLAST search, Term Enrichment and GO Slimmer tools, the GO Online SQL Environment and a user help guide. Used at the Gene Ontology (GO) website to access the data provided by the GO Consortium. Developed and maintained by the GO Consortium.
Collection of genome databases for vertebrates and other eukaryotic species with DNA and protein sequence search capabilities. Used to automatically annotate genomes, integrate this annotation with other available biological data and make the data publicly available via the web. Ensembl tools include BLAST, BLAT, BioMart and the Variant Effect Predictor (VEP) for all supported species.
International functional genomics data collection generated from microarray or next-generation sequencing (NGS) platforms. Repository of functional genomics data supporting publications. Provides gene expression data to the research community for reuse; the data can be queried and downloaded. Integrated with the Gene Expression Atlas and the sequence databases at the European Bioinformatics Institute. Contains a subset of curated and re-annotated Archive data which can be queried for individual gene expression under different biological conditions across experiments. Data are collected to MIAME and MINSEQE standards. Data are submitted by users or are imported directly from the NCBI Gene Expression Omnibus.
Collection of curated, non-redundant genomic DNA, transcript RNA, and protein sequences produced by NCBI. Provides a reference for genome annotation, gene identification and characterization, mutation and polymorphism analysis, expression studies, and comparative analyses. Accessed through the Nucleotide and Protein databases.
Commercial organism provider selling mice, rats and other model animals. American corporation specializing in a variety of pre-clinical and clinical laboratory services for the pharmaceutical, medical device and biotechnology industries. It also supplies assorted biomedical products and research and development outsourcing services for use in the pharmaceutical industry. (Wikipedia)
Ultrafast, memory-efficient software tool for aligning sequencing reads. Bowtie is a short-read aligner.
Online catalog of human genes and genetic disorders, covering clinical features, phenotypes and genes. Collection of human genes and genetic phenotypes, focusing on the relationship between phenotype and genotype. Referenced overviews in OMIM contain information on all known mendelian disorders and a variety of related genes. It is updated daily, and entries contain copious links to other genetics resources.
Software project and repository of R packages for the analysis and comprehension of high-throughput genomic data. Uses a separate set of commands for installing packages.
A powerful toolset for genome arithmetic allowing one to address common genomics tasks such as finding feature overlaps and computing coverage. Bedtools allows one to intersect, merge, count, complement, and shuffle genomic intervals from multiple files in widely-used genomic file formats such as BAM, BED, GFF/GTF, VCF. While each individual tool is designed to do a relatively simple task (e.g., intersect two interval files), quite sophisticated analyses can be conducted by combining multiple bedtools operations on the UNIX command line.
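To make the "intersect two interval files" task concrete, here is a minimal Python sketch of interval intersection. It is not how bedtools is implemented and does not call bedtools; the function name `intersect`, the tuple layout `(chrom, start, end)` and the sample coordinates are all illustrative assumptions. Intervals are treated as half-open, as in the BED format.

```python
from collections import defaultdict


def intersect(a, b):
    """Return the overlapping portions of intervals in `a` with intervals in `b`.

    Each interval is a (chrom, start, end) tuple with a half-open range
    [start, end). This is a simple all-pairs comparison per chromosome,
    chosen for clarity rather than speed.
    """
    # Group the second set by chromosome so only same-chromosome pairs are compared.
    by_chrom = defaultdict(list)
    for chrom, start, end in b:
        by_chrom[chrom].append((start, end))

    out = []
    for chrom, start, end in a:
        for b_start, b_end in by_chrom[chrom]:
            lo, hi = max(start, b_start), min(end, b_end)
            if lo < hi:  # non-empty overlap
                out.append((chrom, lo, hi))
    return out


# Hypothetical example data: three "peaks" and two "genes".
peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 80)]
genes = [("chr1", 150, 550), ("chr2", 90, 120)]
overlaps = intersect(peaks, genes)
```

For real data, a sorted sweep or an interval tree scales better than the all-pairs loop used here.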
Encyclopedia of DNA elements consisting of a list of functional elements in the human genome, including elements that act at the protein and RNA levels, and regulatory elements that control the cells and circumstances in which a gene is active. Enables scientific and medical communities to interpret the role of the human genome in biology and disease. Provides identification of common cell types to facilitate integrative analysis and new experimental technologies based on high-throughput sequencing. Genome Browser containing ENCODE and Epigenomics Roadmap data. Data are available for the entire human genome.
International collaboration producing an extensive public catalog of human genetic variation, including SNPs and structural variants, and their haplotype contexts, in an effort to provide a foundation for investigating the relationship between genotype and phenotype. The genomes of about 2500 unidentified people from about 25 populations around the world were sequenced using next-generation sequencing technologies. Redundant sequencing on various platforms and by different groups of scientists of the same samples can be compared. The results of the study are freely and publicly accessible to researchers worldwide. The consortium identified the following populations whose DNA will be sequenced: Yoruba in Ibadan, Nigeria; Japanese in Tokyo; Chinese in Beijing; Utah residents with ancestry from northern and western Europe; Luhya in Webuye, Kenya; Maasai in Kinyawa, Kenya; Toscani in Italy; Gujarati Indians in Houston; Chinese in metropolitan Denver; people of Mexican ancestry in Los Angeles; and people of African ancestry in the southwestern United States. The goal of the Project is to find most genetic variants that have frequencies of at least 1% in the populations studied. Sequencing is still too expensive to deeply sequence the many samples being studied for this project. However, any particular region of the genome generally contains a limited number of haplotypes. Data can be combined across many samples to allow efficient detection of most of the variants in a region. The Project currently plans to sequence each sample to about 4X coverage; at this depth sequencing cannot provide the complete genotype of each sample, but should allow the detection of most variants with frequencies as low as 1%. Combining the data from 2500 samples should allow highly accurate estimation (imputation) of the variants and genotypes for each sample that were not seen directly by the light sequencing.
All samples from the 1000 Genomes Project are available as lymphoblastoid cell lines (LCLs) and LCL-derived DNA from the Coriell Cell Repository as part of the NHGRI Catalog. The sequence and alignment data generated by the 1000 Genomes Project are made available as quickly as possible via its mirrored ftp sites. ftp://ftp.1000genomes.ebi.ac.uk ftp://ftp-trace.ncbi.nlm.nih.gov/1000genomes
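The statement that about 4X coverage cannot provide the complete genotype of each sample can be illustrated with a rough calculation. The sketch below assumes read depth at a site follows a Poisson distribution and that each read samples either allele of a heterozygous site with equal probability; these are simplifying assumptions for illustration, not the Project's own model, and the function names are hypothetical.

```python
import math


def p_no_reads(mean_depth):
    """Probability that a site receives zero reads under Poisson depth."""
    return math.exp(-mean_depth)


def p_both_alleles_seen(mean_depth):
    """Probability that both alleles of a heterozygous site are each seen at least once.

    With Poisson depth and equal sampling of the two alleles, the read counts
    for each allele are independent Poisson variables with mean depth / 2.
    """
    per_allele = mean_depth / 2
    return (1 - math.exp(-per_allele)) ** 2


depth = 4  # approximate per-sample coverage quoted above
missed = p_no_reads(depth)           # about 0.018
both = p_both_alleles_seen(depth)    # about 0.75
```

Under these assumptions roughly a quarter of heterozygous sites in a single sample would not show both alleles, which is why the entry describes combining data across samples and imputing the genotypes that were not observed directly.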
Central public database of experimentally validated human and mouse noncoding fragments with gene enhancer activity as assessed in transgenic mice. Most of these noncoding elements were selected for testing based on their extreme conservation in other vertebrates or epigenomic evidence (ChIP-Seq) of putative enhancer marks. Users can retrieve elements near single genes of interest, search for enhancers that target reporter gene expression to a particular tissue, or download entire collections of enhancers with defined tissue specificity or conservation depth.
Software package for the analysis of gene expression microarray data, especially the use of linear models for analyzing designed experiments and the assessment of differential expression.
A cloud-based platform to support genomics at your organization.
Database that classifies human transcription factors based on the characteristics of their DNA-binding domains. It comprises six levels (superclasses, classes, families, subfamilies, genera and factor species), two of which are optional (subfamilies and factor species). The full classification can also be obtained as an HTML document and as an ontology in OBO format.
Software tool that screens DNA sequences for interspersed repeats and low complexity DNA sequences. The output of the program is a detailed annotation of the repeats that are present in the query sequence as well as a modified version of the query sequence in which all the annotated repeats have been masked (default: replaced by Ns). Currently over 56% of human genomic sequence is identified and masked by the program. Sequence comparisons in RepeatMasker are performed by one of several popular search engines including nhmmer, cross_match, ABBlast/WUBlast, RMBlast and Decypher. RepeatMasker makes use of curated libraries of repeats and currently supports Dfam (profile HMM library) and RepBase (consensus sequence library).
Human and mouse genome annotation project which aims to identify all gene features in the human genome using computational analysis, manual annotation, and experimental validation.
Consortium to build a comprehensive parts list of functional elements in the human genome. This includes elements that act at the protein and RNA levels, and regulatory elements that control the cells and circumstances in which a gene is active. Data from 2012-present.
Software package for differential gene expression analysis based on the negative binomial distribution. Used for differential analysis of RNA-seq count data, using shrinkage estimation for dispersions and fold changes to improve stability and interpretability of estimates.
Alignment analysis software tool for comparative mapping between two genome assemblies or between two different genomes. It can cache intermediate results to speed up comparisons of multiple sequences.
Web tool to convert genome coordinates and genome annotation files between assemblies. Used to translate genomic coordinates from one assembly version into another and to retrieve putative orthologous regions in other species using UCSC chained and netted alignments.
Core facility established to assist the Salk community with integrating genomics data into their research. The primary focus of the core is to provide analysis support for next-generation sequencing applications.
Mus musculus with name FVB/NCrl from IMSR.
Mus musculus with name C57BL/6NCrl from IMSR.
Laboratory mouse with name C57BL/6NTac from MGI.
Laboratory mouse with name C57BL/6N from MGI.