Systematic bias in high-throughput sequencing data and its correction by BEADS.

Genomic sequences obtained through high-throughput sequencing are not uniformly distributed across the genome. For example, sequencing data of total genomic DNA show significant yet unexpected enrichment at promoters and exons. This systematic bias is a particular problem for techniques such as chromatin immunoprecipitation, where the signal for a target factor is plotted across genomic features. We have focused on data obtained from Illumina's Genome Analyser platform, where at least three factors contribute to sequence bias: GC content, mappability of sequencing reads, and regional biases that might be generated by local structure. We show that relying on input control as a normalizer is not generally appropriate because the bias varies from sample to sample. To correct sequence bias, we present BEADS (bias elimination algorithm for deep sequencing), a simple three-step normalization scheme that successfully unmasks real binding patterns in ChIP-seq data. We suggest that this procedure be done routinely prior to data interpretation and downstream analyses.
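The abstract names three bias sources (GC content, mappability, regional bias) but does not spell out the three steps. The following is a minimal sketch, not the authors' implementation, assuming the steps map one-to-one onto those sources: reads are reweighted by GC content, per-bin signal is scaled by mappability, and the result is divided by a similarly corrected input signal. The function names, the bin count, and the `min_mappability` cutoff are all hypothetical choices made for illustration.

```python
from collections import Counter


def gc_fraction(seq):
    """Fraction of G or C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq) if seq else 0.0


def gc_weights(read_seqs, genome_windows, n_bins=10):
    """Assumed step 1: weight per GC bin = expected frequency / observed frequency.

    `genome_windows` stands in for read-length windows sampled from the
    reference genome; bins over-represented among reads get weights below 1.
    """
    if not genome_windows:
        raise ValueError("genome_windows must not be empty")

    def bin_of(frac):
        # Clamp so that a GC fraction of exactly 1.0 falls in the last bin.
        return min(int(frac * n_bins), n_bins - 1)

    observed = Counter(bin_of(gc_fraction(s)) for s in read_seqs)
    expected = Counter(bin_of(gc_fraction(s)) for s in genome_windows)
    n_obs, n_exp = len(read_seqs), len(genome_windows)
    return {
        b: (expected.get(b, 0) / n_exp) / (count / n_obs)
        for b, count in observed.items()
    }


def correct_signal(weighted_counts, mappability, input_signal, min_mappability=0.25):
    """Assumed steps 2 and 3: scale by mappability, then divide by input.

    All three arguments are per-bin lists of equal length. Bins with low
    mappability or no input coverage are returned as None (excluded).
    """
    corrected = []
    for count, mapp, inp in zip(weighted_counts, mappability, input_signal):
        if mapp < min_mappability or inp <= 0:
            corrected.append(None)
        else:
            corrected.append(count / mapp / inp)
    return corrected
```

In this sketch, the weights from `gc_weights` would be summed per genomic bin to produce `weighted_counts` for both the ChIP sample and the input control, and `input_signal` would be the input control after the same GC and mappability corrections.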

PubMed ID: 21646344


  • Cheung MS
  • Down TA
  • Latorre I
  • Ahringer J


Nucleic acids research

Publication Data

August 23, 2011

Associated Grants

  • Agency: Wellcome Trust, Id: 054523
  • Agency: NHGRI NIH HHS, Id: 1-U01-HG004270-01
  • Agency: Cancer Research UK, Id:

Mesh Terms

  • Algorithms
  • Animals
  • Base Composition
  • Caenorhabditis elegans
  • Chromatin Immunoprecipitation
  • DNA, Helminth
  • High-Throughput Nucleotide Sequencing
  • Sequence Analysis, DNA