Recent genome-wide analyses using chromatin immunoprecipitation coupled with high-throughput sequencing (ChIP-seq), performed in various eukaryotic organisms, have examined RNA polymerase II (Pol II) pausing around the transcription start sites of genes. In this study, we further investigated genome-wide binding of Pol II downstream of the 3' end of the annotated genes (EAGs) by ChIP-seq in human cells. At almost all expressed genes, we observed Pol II occupancy downstream of the EAGs, suggesting that Pol II pausing 3' of the transcription units is a common phenomenon. Pol II transcripts can also be detected downstream of EAGs by global run-on and sequencing, suggesting the presence of functionally active Pol II. Based on Pol II occupancy downstream of EAGs, we could distinguish distinct clusters of Pol II pause patterns. On core histone genes, which code for non-polyadenylated transcripts, Pol II occupancy drops quickly after the EAG. In contrast, on genes whose transcripts undergo poly(A) tail addition [poly(A)(+)], Pol II occupancy can be detected up to 4-6 kb downstream of the EAGs. Inhibition of polyadenylation significantly increased Pol II occupancy downstream of EAGs at poly(A)(+) genes, but not at the EAGs of core histone genes. The differential genome-wide Pol II occupancy profiles 3' of the EAGs were also confirmed in mouse embryonic stem (mES) cells, indicating that Pol II pauses genome-wide downstream of the EAGs in mammalian cells. Moreover, in mES cells the sharp drop of Pol II signal at the EAG of core histone genes seems to be independent of the phosphorylation status of the C-terminal domain of the large subunit of Pol II. Thus, our study uncovers a potential link between different mRNA 3' end processing mechanisms and the consequent Pol II transcription termination processes.
PubMed ID: 22701709
Publication data is provided by the National Library of Medicine ® and PubMed ®. Data is retrieved from PubMed ® on a weekly schedule. For terms and conditions see the National Library of Medicine Terms and Conditions.
Software environment and programming language for statistical computing and graphics. R is an integrated suite of software facilities for data manipulation, calculation and graphical display. It can be extended via packages. Some packages are supplied with the R distribution and more are available through the CRAN family of sites. It compiles and runs on a wide variety of UNIX platforms, Windows and macOS.
Functional genomics data repository supporting MIAME-compliant data submissions. Includes microarray-based experiments measuring the abundance of mRNA, genomic DNA, and protein molecules, as well as non-array-based technologies such as serial analysis of gene expression (SAGE) and mass spectrometry proteomic technology. Array- and sequence-based data are accepted. Collection of curated gene expression DataSets, as well as original Series and Platform records. The database can be searched using keywords, organism, DataSet type and authors. DataSet records contain additional resources including cluster tools and differential expression queries.
Set of software modules for performing common ChIP-seq data analysis tasks across the whole genome, including positional correlation analysis, peak detection, and genome partitioning into signal-rich and signal-poor regions. The tools are designed to be simple, fast and highly modular. Each program carries out a well-defined data processing procedure that can potentially fit into a pipeline framework. ChIP-Seq is also freely available through a web interface.
MCF-7 is a cancer cell line with species of origin Homo sapiens (human).