Systematic analysis of transcription start sites in avian development.

PLoS biology | 2017

Cap Analysis of Gene Expression (CAGE) in combination with single-molecule sequencing technology allows precision mapping of transcription start sites (TSSs) and genome-wide capture of promoter activities in differentiated and steady state cell populations. Much less is known about whether TSS profiling can characterize diverse and non-steady state cell populations, such as the approximately 400 transitory and heterogeneous cell types that arise during ontogeny of vertebrate animals. To gain such insight, we used the chick model and performed CAGE-based TSS analysis on embryonic samples covering the full 3-week developmental period. In total, 31,863 robust TSS peaks (>1 tag per million [TPM]) were mapped to the latest chicken genome assembly, of which 34% to 46% were active in any given developmental stage. ZENBU, a web-based, open-source platform, was used for interactive data exploration. TSSs of genes critical for lineage differentiation could be precisely mapped and their activities tracked throughout development, suggesting that non-steady state and heterogeneous cell populations are amenable to CAGE-based transcriptional analysis. Our study also uncovered a large set of extremely stable housekeeping TSSs and many novel stage-specific ones. We furthermore demonstrated that TSS mapping could expedite motif-based promoter analysis for regulatory modules associated with stage-specific and housekeeping genes. Finally, using Brachyury as an example, we provide evidence that precise TSS mapping in combination with Clustered Regularly Interspaced Short Palindromic Repeat (CRISPR)-on technology enables us, for the first time, to efficiently target endogenous avian genes for transcriptional activation. Taken together, our results represent the first report of genome-wide TSS mapping in birds and the first systematic developmental TSS analysis in any amniote species (birds and mammals). 
By facilitating promoter-based molecular analysis and genetic manipulation, our work also underscores the value of avian models in unravelling the complex regulatory mechanism of cell lineage specification during amniote development.
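The abstract's ">1 tag per million" cutoff can be illustrated with a short sketch. It assumes the simplest definition of TPM (a peak's raw tag count divided by the total number of mapped tags in the library, times one million); the paper's exact normalization and its criteria for a "robust" peak may differ. The function names, peak names, and counts below are hypothetical.

```python
def tags_per_million(peak_counts, library_size):
    """Scale raw CAGE tag counts to tags per million (TPM).

    peak_counts: dict mapping a peak name to its raw tag count (hypothetical data).
    library_size: total mapped tags in the library (assumed normalization base).
    """
    if library_size <= 0:
        raise ValueError("library_size must be positive")
    return {peak: count * 1_000_000 / library_size
            for peak, count in peak_counts.items()}


def peaks_above_threshold(peak_counts, library_size, min_tpm=1.0):
    """Keep only peaks whose TPM is strictly greater than min_tpm."""
    tpm = tags_per_million(peak_counts, library_size)
    return {peak: value for peak, value in tpm.items() if value > min_tpm}


# Toy example: three peaks in a library of two million mapped tags.
counts = {"peak_a": 50, "peak_b": 3, "peak_c": 1}
kept = peaks_above_threshold(counts, library_size=2_000_000)
print(sorted(kept))
```

With a library of two million tags, one TPM corresponds to two raw tags, so a peak supported by a single tag falls below the cutoff.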

PubMed ID: 28873399

Research resources used in this publication

None found

Antibodies used in this publication

None found

Associated grants

None

Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.

This is a list of tools and resources that we have found mentioned in this publication.


MEME Suite - Motif-based sequence analysis tools (tool)

RRID:SCR_001783

Suite of motif-based sequence analysis tools to discover motifs using MEME, DREME (DNA only) or GLAM2 on groups of related DNA or protein sequences; search sequence databases with motifs using MAST, FIMO, MCAST or GLAM2SCAN; compare a motif to all motifs in a database of motifs; associate motifs with Gene Ontology terms via their putative target genes, and analyze motif enrichment using SpaMo or CentriMo. Source code, binaries and a web server are freely available for noncommercial use.


JASPAR (tool)

RRID:SCR_003030

Open-source database of curated, non-redundant profiles derived from published collections of experimentally defined transcription factor binding sites for multicellular eukaryotes. Its guiding principles are open data access, non-redundancy and quality. JASPAR CORE is a smaller set that is non-redundant and curated. The profiles are transcription factor DNA-binding preferences, modeled as matrices. These can be converted into Position Weight Matrices (PWMs or PSSMs), which are used for scanning genomic sequences. JASPAR provides a web interface for browsing, searching and subset selection, an online sequence analysis utility and a suite of programming tools for genome-wide and comparative genomic analysis of regulatory regions. New functions include clustering of matrix models by similarity, generation of random matrices by sampling from selected sets of existing models and a language-independent Web Service application programming interface for matrix retrieval.
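As a rough illustration of the matrix-to-PWM conversion and sequence scanning described above, the following is a minimal sketch in plain Python. It does not use the JASPAR API or any of its programming tools; the toy count matrix, the pseudocount value, the uniform background, and all function names are assumptions chosen for illustration.

```python
import math

BASES = "ACGT"


def pfm_to_pwm(pfm, pseudocount=0.8, background=None):
    """Convert a position frequency (count) matrix into log-odds scores.

    pfm: dict mapping each base to a list of counts, one per motif position.
    pseudocount and the uniform background are illustrative assumptions.
    """
    background = background or {base: 0.25 for base in BASES}
    width = len(pfm["A"])
    pwm = []
    for i in range(width):
        column_total = sum(pfm[base][i] for base in BASES)
        column = {}
        for base in BASES:
            # Smoothed base frequency at this position.
            freq = (pfm[base][i] + pseudocount * background[base]) / (
                column_total + pseudocount
            )
            column[base] = math.log2(freq / background[base])
        pwm.append(column)
    return pwm


def scan(sequence, pwm):
    """Score every window of the sequence; return (start, score) pairs."""
    sequence = sequence.upper()
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if any(base not in BASES for base in window):
            continue  # skip windows containing N or other ambiguity codes
        score = sum(pwm[i][base] for i, base in enumerate(window))
        hits.append((start, score))
    return hits


# Toy count matrix for a hypothetical 4-bp "TATA" motif (not a real JASPAR profile).
toy_pfm = {
    "A": [0, 10, 0, 10],
    "C": [0, 0, 0, 0],
    "G": [0, 0, 0, 0],
    "T": [10, 0, 10, 0],
}
pwm = pfm_to_pwm(toy_pfm)
best_start, best_score = max(scan("GGCTATAGG", pwm), key=lambda hit: hit[1])
print(best_start, round(best_score, 2))
```

The pseudocount keeps the log-odds finite for bases never observed at a position; real analyses typically also score the reverse strand and apply a score threshold, both omitted here for brevity.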


RefSeq (tool)

RRID:SCR_003496

Collection of curated, non-redundant genomic DNA, transcript RNA, and protein sequences produced by NCBI. Provides a reference for genome annotation, gene identification and characterization, mutation and polymorphism analysis, expression studies, and comparative analyses. Accessed through the Nucleotide and Protein databases.


Bioconductor (tool)

RRID:SCR_006442

Software project and repository, based on the R programming language, that provides packages for the analysis and comprehension of high-throughput genomic data. Packages are installed with a separate set of commands.


CAGE (tool)

RRID:SCR_007574

Expression profiling and promoter identification software tool for transcriptional network analysis and transcriptome characterization. DeepCAGE, the combination of next-generation sequencing with next-generation expression profiling, provides unsurpassed solutions for expression profiling and genome annotation. CAGE is the experimental approach needed to link gene expression and control regions in the genome. With the availability of next-generation sequencing methods, DNAFORM now offers DeepCAGE services. DeepCAGE libraries are prepared for direct analysis by an Illumina/Solexa sequencer. One sequencing run using one channel on an Illumina/Solexa sequencer can yield over 4,000,000 reads per sample.

CAGE is based on our full-length cDNA library technology, in which an adaptor is ligated to the 5'-end of full-length cDNAs, introducing a recognition site for a Class IIs restriction endonuclease adjacent to the 5'-end of the cDNA. The Class IIs restriction endonuclease, here MmeI, allows short tags derived from the 5'-end of transcripts to be cloned into concatemers for high-throughput sequencing. CAGE tags are further characterized by mapping to genomic sequences, which enables the identification of transcriptional start sites. As such, CAGE can contribute to projects in Gene Discovery, Gene Expression, and Promoter Identification.

Now that the genome sequencing projects have provided the genetic blueprints for many organisms, new questions have to be answered on how to correlate the observed genotypes with related phenotypes, and how to understand the regulation of genetic information in time and space. The dynamics of living systems and the functional behavior of cells in multicellular organisms have thus become the subject of the emerging field of systems biology. Integrating experimental approaches and computer-aided theories at the system level will be the fundamental principle driving systems biology. Understanding the principles behind complex regulatory networks is an ambitious goal that requires new approaches in the life sciences. For ordering and additional information, please contact us at contact_at_dnaform.jp


edgeR (tool)

RRID:SCR_012802

Bioconductor software package for Empirical analysis of Digital Gene Expression data in R. Used for differential expression analysis of RNA-seq and digital gene expression data with biological replication.
