A signal-noise model for significance analysis of ChIP-seq with negative control.

MOTIVATION: ChIP-seq is becoming the main approach to the genome-wide study of protein-DNA interactions and histone modifications. Existing informatics tools perform well at extracting strong ChIP-enriched sites. However, two questions remain to be answered: (i) to what extent is a ChIP-seq experiment able to reveal the weak ChIP-enriched sites? (ii) are the weak sites biologically meaningful? To answer these questions, it is necessary to distinguish the weak ChIP signals from background noise.

RESULTS: We propose a linear signal-noise model, in which a noise rate is introduced to represent the fraction of noise in a ChIP library. We developed an iterative algorithm to estimate the noise rate using a control library, and derived a library-swapping strategy for false discovery rate estimation. These approaches were integrated into a general-purpose framework, named CCAT (Control-based ChIP-seq Analysis Tool), for the significance analysis of ChIP-seq. Applications to H3K4me3 and H3K36me3 datasets showed that CCAT predicted significantly more ChIP-enriched sites than the previous methods did. With the high sensitivity of CCAT prediction, we revealed distinct chromatin features associated with the strong and weak H3K4me3 sites.

AVAILABILITY: http://cmb.gis.a-star.edu.sg/ChIPSeq/tools.htm.
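The abstract does not spell out the algorithm, so the following is only a rough sketch of what an iterative noise-rate estimate and a library-swapping FDR estimate could look like. It is not the CCAT implementation. The per-bin count representation, the function names `estimate_noise_rate` and `swap_fdr`, and the `fold` enrichment threshold are all assumptions made for illustration.

```python
import numpy as np

def estimate_noise_rate(chip, control, fold=2.0, max_iter=50, tol=1e-9):
    """Illustrative iterative estimate (assumed scheme, not CCAT's own code).

    chip, control: read counts over the same genomic bins.
    Returns (alpha, noise_fraction), where alpha scales control counts to the
    expected noise in the ChIP library, and noise_fraction is the share of
    ChIP reads attributed to noise.
    """
    chip = np.asarray(chip, dtype=float)
    control = np.asarray(control, dtype=float)
    if chip.sum() == 0 or control.sum() == 0:
        raise ValueError("both libraries must contain reads")
    # Start from the assumption that the whole ChIP library is noise.
    alpha = chip.sum() / control.sum()
    for _ in range(max_iter):
        # Bins not exceeding `fold` times the scaled control count as background.
        background = chip <= fold * alpha * control
        denom = control[background].sum()
        if denom == 0:
            break
        new_alpha = chip[background].sum() / denom
        converged = abs(new_alpha - alpha) < tol
        alpha = new_alpha
        if converged:
            break
    noise_fraction = alpha * control.sum() / chip.sum()
    return alpha, noise_fraction

def swap_fdr(chip, control, alpha, fold=2.0):
    """Estimate FDR by swapping the roles of the ChIP and control libraries."""
    chip = np.asarray(chip, dtype=float)
    noise = alpha * np.asarray(control, dtype=float)  # control in ChIP units
    called = np.sum(chip > fold * noise)   # ChIP tested against control
    swapped = np.sum(noise > fold * chip)  # control tested against ChIP
    return float(swapped / called) if called else 0.0
```

With a toy library of 90 background bins (5 ChIP reads, 10 control reads each) and 10 enriched bins (100 ChIP reads, 10 control reads each), the estimate settles on the background ratio after one update, and the swapped comparison calls no bins, giving an estimated FDR of zero.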

PubMed ID: 20371496


  • Xu H
  • Handoko L
  • Wei X
  • Ye C
  • Sheng J
  • Wei CL
  • Lin F
  • Sung WK


Bioinformatics (Oxford, England)

Publication Data

May 1, 2010

MeSH Terms

  • Algorithms
  • Binding Sites
  • Chromatin Immunoprecipitation
  • Computational Biology
  • Computer Simulation
  • Gene Expression Regulation
  • Genome
  • Histones
  • Models, Statistical
  • Poisson Distribution
  • Reproducibility of Results
  • Software