The reference human genome sequence set the stage for studies of genetic variation and its association with human disease, but epigenomic studies lack a similar reference. To address this need, the NIH Roadmap Epigenomics Consortium generated the largest collection so far of human epigenomes for primary cells and tissues. Here we describe the integrative analysis of 111 reference human epigenomes generated as part of the programme, profiled for histone modification patterns, DNA accessibility, DNA methylation and RNA expression. We establish global maps of regulatory elements, define regulatory modules of coordinated activity, and their likely activators and repressors. We show that disease- and trait-associated genetic variants are enriched in tissue-specific epigenomic marks, revealing biologically relevant cell types for diverse human traits, and providing a resource for interpreting the molecular basis of human disease. Our results demonstrate the central role of epigenomic information for understanding gene regulation, cellular differentiation and human disease.
PubMed ID: 25693563
Software that identifies constrained elements in multiple alignments by quantifying substitution deficits. These deficits represent substitutions that would have occurred if the element were neutral DNA, but did not occur because the element has been under functional constraint. We refer to these deficits as Rejected Substitutions. Rejected substitutions are a natural measure of constraint that reflects the strength of past purifying selection on the element. GERP estimates constraint for each alignment column; elements are identified as excess aggregations of constrained columns. A false-positive rate (which is user-settable) is calculated using "shuffled" alignments in which the order of columns is randomized.
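The rejected-substitution idea described above can be illustrated with a minimal Python sketch. It is not GERP's implementation: it assumes the per-column expected (neutral) and observed substitution counts have already been estimated, and the function names, the `min_len` threshold, and the rule "a run of consecutive columns with a positive deficit" are hypothetical simplifications chosen for illustration. The shuffling step permutes the per-column scores, standing in for the shuffled alignments mentioned in the description.

```python
import random


def rejected_substitutions(expected, observed):
    """Per-column deficit: substitutions expected under neutrality minus those observed."""
    return [e - o for e, o in zip(expected, observed)]


def candidate_elements(rs, min_len=3):
    """Return (start, end, score) for runs of at least min_len columns with a positive deficit.

    `end` is exclusive; `score` is the summed deficit over the run.
    """
    elements = []
    start = None
    # A trailing 0.0 acts as a sentinel so a run reaching the last column is closed.
    for i, value in enumerate(list(rs) + [0.0]):
        if value > 0 and start is None:
            start = i
        elif value <= 0 and start is not None:
            if i - start >= min_len:
                elements.append((start, i, sum(rs[start:i])))
            start = None
    return elements


def mean_shuffled_elements(rs, n_shuffles=100, min_len=3, seed=0):
    """Average number of elements called after randomizing column order (a null baseline)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_shuffles):
        columns = list(rs)
        rng.shuffle(columns)
        total += len(candidate_elements(columns, min_len))
    return total / n_shuffles


if __name__ == "__main__":
    expected = [2.0] * 8
    observed = [2.0, 0.5, 0.5, 0.5, 2.5, 2.0, 1.0, 1.0]
    rs = rejected_substitutions(expected, observed)
    print(candidate_elements(rs))
    print(mean_shuffled_elements(rs))
```

Comparing the number of elements called on the real column order with the shuffled average gives a rough sense of how many calls would arise by chance, which is the role the user-settable false-positive rate plays in the description.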
A software application and database viewing system for genomic research, more specifically for multi-genome comparison and pattern discovery via genome self-comparison. Data are available for a range of species including Human Chr3, Human Chr12, Sea Urchin, Tribolium, and cow. The Genboree Discovery System is the largest software system developed at the bioinformatics laboratory at Baylor in close collaboration with the Human Genome Sequencing Center. Genboree is a turnkey software system for genomic research. Genboree is hosted on the Internet and, as of early 2007, the number of registered users exceeds 600. While it can be configured to support almost any genome-centric discovery process, a number of configurations already exist for specific applications. Current focus is on enabling studies of genome variation, including array CGH studies, PCR-based resequencing, genome resequencing using comparative sequence assembly, genome remapping using paired-end tags and sequences, genome analysis and annotation, multi-genome comparison and pattern discovery via genome self-comparison. Genboree database and visualization settings, tools, and user roles are configurable to fit the needs of specific discovery processes. Private permanent project-specific databases can be accessed in a controlled way by collaborators via the Internet. Project-specific data is integrated with relevant data from public sources such as genome browsers and genomic databases. Data processing tools are integrated using a plug-in model. Genboree is extensible via flexible data-exchange formats to accommodate project-specific tools and processing steps. Our Positional Hashing method, implemented in the Pash program, enables extremely fast and accurate sequence comparison and pattern discovery by employing low-level parallelism. 
Pash enables fast and sensitive detection of orthologous regions across mammalian genomes, and fast anchoring of hundreds of millions of short sequences produced by next-generation sequencing technologies. We are further developing the Pash program and employing it in the context of various discovery pipelines. Our laboratory participates in the pilot stage of the TCGA (The Cancer Genome Atlas) project. We aim to develop comprehensive, rapid, and economical methods for detecting recurrent chromosomal aberrations in cancer using next-generation sequencing technologies. The methods will allow detection of recurrent chromosomal aberrations in hundreds of small (
Software package that computes quick but highly informative enrichment and quality measures for ChIP-seq/DNase-seq/FAIRE-seq/MNase-seq data. It can also be used to obtain robust estimates of the predominant fragment length or characteristic tag shift values in these assays.
Software tool for visualizing and interacting with whole-genome datasets. The browser hosts Human Epigenome Atlas data produced by the Roadmap Epigenomics project, and its use of advanced, multi-resolution data formats and its user-friendly interface make it possible for investigators to upload and visualize their own data as custom tracks. Developed and maintained by the Epigenome Informatics Group at Washington University in St. Louis.
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on July 11, 2022. Project for human epigenomic data from experimental pipelines built around next-generation sequencing technologies to map DNA methylation, histone modifications, chromatin accessibility and small RNA transcripts in stem cells and primary ex vivo tissues selected to represent normal counterparts of tissues and organ systems frequently involved in human disease. The Consortium expects to deliver a collection of normal epigenomes that will provide a framework or reference for comparison and integration within a broad array of future studies. The Consortium is also committed to the development, standardization and dissemination of protocols, reagents and analytical tools to enable the research community to utilize, integrate and expand upon this body of data.
Open source software package for the statistical programming language R that creates plots based on the grammar of graphics. Used for data visualization, it breaks graphs up into semantic components such as scales and layers.
Set of software modules for performing common ChIP-seq data analysis tasks across the whole genome, including positional correlation analysis, peak detection, and genome partitioning into signal-rich and signal-poor regions. The tools are designed to be simple, fast and highly modular. Each program carries out a well defined data processing procedure that can potentially fit into a pipeline framework. ChIP-Seq is also freely available on a Web interface.
Software application that can be used for converting Eland, Maq (.map), BED or other files into WIG files and identifying areas of enrichment (ChIP-Seq analysis).
Software tool that screens DNA sequences for interspersed repeats and low complexity DNA sequences. The output of the program is a detailed annotation of the repeats that are present in the query sequence as well as a modified version of the query sequence in which all the annotated repeats have been masked (default: replaced by Ns). Currently over 56% of human genomic sequence is identified and masked by the program. Sequence comparisons in RepeatMasker are performed by one of several popular search engines including nhmmer, cross_match, ABBlast/WUBlast, RMBlast and Decypher. RepeatMasker makes use of curated libraries of repeats and currently supports Dfam (profile HMM library) and RepBase (consensus sequence library).
Cell line K-562 is a cancer cell line with species of origin Homo sapiens (human).
Cell line GM12878 is a transformed cell line with species of origin Homo sapiens (human).