Systematic elucidation and in vivo validation of sequences enriched in hindbrain transcriptional control.

Genome research | 2012

Illuminating the primary sequence encryption of enhancers is central to understanding the regulatory architecture of genomes. We have developed a machine learning approach to decipher motif patterns of hindbrain enhancers and identify 40,000 sequences in the human genome that we predict display regulatory control that includes the hindbrain. Consistent with their roles in hindbrain patterning, MEIS1, NKX6-1, as well as HOX and POU family binding motifs contributed strongly to this enhancer model. Predicted hindbrain enhancers are overrepresented at genes expressed in hindbrain and associated with nervous system development, and primarily reside in the areas of open chromatin. In addition, 77 (0.2%) of these predictions are identified as hindbrain enhancers on the VISTA Enhancer Browser, and 26,000 (60%) overlap enhancer marks (H3K4me1 or H3K27ac). To validate these putative hindbrain enhancers, we selected 55 elements distributed throughout our predictions and six low scoring controls for evaluation in a zebrafish transgenic assay. When assayed in mosaic transgenic embryos, 51/55 elements directed expression in the central nervous system. Furthermore, 30/34 (88%) predicted enhancers analyzed in stable zebrafish transgenic lines directed expression in the larval zebrafish hindbrain. Subsequent analysis of sequence fragments selected based upon motif clustering further confirmed the critical role of the motifs contributing to the classifier. Our results demonstrate the existence of a primary sequence code characteristic to hindbrain enhancers. This code can be accurately extracted using machine-learning approaches and applied successfully for de novo identification of hindbrain enhancers. This study represents a critical step toward the dissection of regulatory control in specific neuronal subtypes.

PubMed ID: 22759862

Research resources used in this publication

None found

Antibodies used in this publication

None found

Associated grants

  • Agency: NIGMS NIH HHS, United States
    Id: T32 GM007814
  • Agency: NINDS NIH HHS, United States
    Id: R01 NS062972
  • Agency: Intramural NIH HHS, United States
  • Agency: NIGMS NIH HHS, United States
    Id: GM07814
  • Agency: NINDS NIH HHS, United States
    Id: NS062972

Publication data is provided by the National Library of Medicine® and PubMed®. Data is retrieved from PubMed® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.


JASPAR (tool)

RRID:SCR_003030

Open source database of curated, non-redundant profiles derived from published collections of experimentally defined transcription factor binding sites for multicellular eukaryotes. Its defining features are open data access, non-redundancy and quality. JASPAR CORE is a smaller set that is non-redundant and curated. Collection of transcription factor DNA-binding preferences, modeled as matrices. These can be converted into Position Weight Matrices (PWMs or PSSMs), used for scanning genomic sequences. Provides a web interface for browsing, searching and subset selection, an online sequence analysis utility and a suite of programming tools for genome-wide and comparative genomic analysis of regulatory regions. New functions include clustering of matrix models by similarity, generation of random matrices by sampling from selected sets of existing models and a language-independent Web Service application programming interface for matrix retrieval.
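The conversion from a count matrix to a Position Weight Matrix, and its use for scanning a sequence, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the count matrix is invented rather than taken from JASPAR, the background is uniform (0.25 per base), the pseudocount of 0.5 is an arbitrary choice, and the function names `pfm_to_pwm` and `scan` are hypothetical, not part of any JASPAR tool. The sketch only handles uppercase A, C, G and T on one strand.

```python
import math

# Hypothetical count matrix for a 4-bp motif (one list entry per position).
# The numbers are invented for illustration and are not a JASPAR profile.
PFM = {
    "A": [8, 0, 0, 1],
    "C": [0, 0, 9, 1],
    "G": [1, 0, 0, 7],
    "T": [1, 10, 1, 1],
}

def pfm_to_pwm(pfm, pseudocount=0.5, background=0.25):
    """Convert counts to log2-odds scores against a uniform background."""
    bases = "ACGT"
    width = len(pfm["A"])
    pwm = {b: [] for b in bases}
    for i in range(width):
        # Pseudocounts keep zero counts from producing log(0).
        total = sum(pfm[b][i] for b in bases) + 4 * pseudocount
        for b in bases:
            freq = (pfm[b][i] + pseudocount) / total
            pwm[b].append(math.log2(freq / background))
    return pwm

def scan(sequence, pwm):
    """Return (offset, score) for every full-width window of the sequence."""
    width = len(pwm["A"])
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[base][i] for i, base in enumerate(window))
        hits.append((start, score))
    return hits

pwm = pfm_to_pwm(PFM)
hits = scan("GGATCGTT", pwm)
best_offset, best_score = max(hits, key=lambda h: h[1])
print(best_offset, round(best_score, 2))
```

The highest-scoring window is the one matching the motif's most frequent base at every position. A real scan would typically also score the reverse complement and apply a score threshold; both are omitted here for brevity.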

Mouse Genome Informatics (MGI) (tool)

RRID:SCR_006460

International database for laboratory mouse. Data offered by The Jackson Laboratory includes information on integrated genetic, genomic, and biological data. MGI creates and maintains integrated representation of mouse genetic, genomic, expression, and phenotype data and develops reference data set and consensus data views, synthesizes comparative genomic data between mouse and other mammals, maintains set of links and collaborations with other bioinformatics resources, develops and supports analysis and data submission tools, and provides technical support for database users. Projects contributing to this resource are: Mouse Genome Database (MGD) Project, Gene Expression Database (GXD) Project, Mouse Tumor Biology (MTB) Database Project, Gene Ontology (GO) Project at MGI, and MouseCyc Project at MGI.

ENCODE (tool)

RRID:SCR_006793

Encyclopedia of DNA elements consisting of list of functional elements in human genome, including elements that act at protein and RNA levels, and regulatory elements that control cells and circumstances in which gene is active. Enables scientific and medical communities to interpret role of human genome in biology and disease. Provides identification of common cell types to facilitate integrative analysis and new experimental technologies based on high-throughput sequencing. Genome Browser containing ENCODE and Epigenomics Roadmap data. Data are available for entire human genome.
