A signal-noise model for significance analysis of ChIP-seq with negative control.
MOTIVATION: ChIP-seq is becoming the main approach to the genome-wide study of protein-DNA interactions and histone modifications. Existing informatics tools perform well at extracting strong ChIP-enriched sites. However, two questions remain to be answered: (i) to what extent is a ChIP-seq experiment able to reveal the weak ChIP-enriched sites? (ii) are the weak sites biologically meaningful? To answer these questions, it is necessary to distinguish the weak ChIP signals from background noise. RESULTS: We propose a linear signal-noise model, in which a noise rate is introduced to represent the fraction of noise in a ChIP library. We developed an iterative algorithm to estimate the noise rate using a control library, and derived a library-swapping strategy for false discovery rate estimation. These approaches were integrated into a general-purpose framework, named CCAT (Control-based ChIP-seq Analysis Tool), for the significance analysis of ChIP-seq. Applications to H3K4me3 and H3K36me3 datasets showed that CCAT predicted significantly more ChIP-enriched sites than the previous methods did. With the high sensitivity of CCAT prediction, we revealed distinct chromatin features associated with the strong and weak H3K4me3 sites. AVAILABILITY: http://cmb.gis.a-star.edu.sg/ChIPSeq/tools.htm.
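The abstract names an iterative noise-rate estimate and a library-swapping false discovery rate but does not spell out either procedure. The following is a minimal sketch of the general idea, not CCAT's actual implementation. It assumes per-window read counts for the ChIP and control libraries. The function names, the `fold` cutoff used to set aside enriched windows, and the difference score are all hypothetical choices made for illustration.

```python
from typing import Sequence


def estimate_noise_ratio(chip: Sequence[float], control: Sequence[float],
                         fold: float = 2.0, max_iter: int = 50,
                         tol: float = 1e-9) -> float:
    """Estimate the ChIP/control scaling ratio over background windows.

    Assumption (not from the source): a window counts as background when
    its ChIP count is at most `fold` times the scaled control count.
    """
    total_control = sum(control)
    if total_control == 0:
        raise ValueError("control library has no reads")
    # Start from the library-size ratio, which overestimates the noise
    # because enriched windows inflate the ChIP total.
    ratio = sum(chip) / total_control
    for _ in range(max_iter):
        background = [(x, y) for x, y in zip(chip, control)
                      if x <= fold * ratio * y]
        bg_control = sum(y for _, y in background)
        if bg_control == 0:
            break
        # Re-estimate the ratio using only the windows currently
        # classified as background.
        new_ratio = sum(x for x, _ in background) / bg_control
        converged = abs(new_ratio - ratio) < tol
        ratio = new_ratio
        if converged:
            break
    return ratio


def noise_rate(chip: Sequence[float], control: Sequence[float],
               ratio: float) -> float:
    """Fraction of ChIP reads attributed to noise under the linear model."""
    return ratio * sum(control) / sum(chip)


def swap_fdr(chip: Sequence[float], control: Sequence[float],
             ratio: float, threshold: float) -> float:
    """Library-swapping FDR estimate at a given score threshold.

    Windows are scored as ChIP minus scaled control. Swapping the two
    libraries negates every score, and windows that still pass the
    threshold after the swap are counted as false discoveries.
    """
    forward = [x - ratio * y for x, y in zip(chip, control)]
    swapped = [-s for s in forward]
    n_forward = sum(s >= threshold for s in forward)
    n_swapped = sum(s >= threshold for s in swapped)
    if n_forward == 0:
        return 0.0
    return min(1.0, n_swapped / n_forward)
```

On a toy input of eight background windows and two enriched windows, the ratio drops from the library-size estimate to the background-only estimate within two iterations. A real tool would use a proper enrichment statistic rather than the plain difference score assumed here.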
SciCrunch is a data sharing and display platform. Anyone can create a custom portal where they can select searchable subsets of hundreds of data sources, brand their web pages and create their community. SciCrunch will push data updates automatically to all portals on a weekly basis. User communities can also add their own data to SciCrunch; however, this is not currently a free service.