Systematic bias in high-throughput sequencing data and its correction by BEADS.

Genomic sequences obtained through high-throughput sequencing are not uniformly distributed across the genome. For example, sequencing data of total genomic DNA show significant, yet unexpected, enrichments on promoters and exons. This systematic bias is a particular problem for techniques such as chromatin immunoprecipitation, where the signal for a target factor is plotted across genomic features. We have focused on data obtained from Illumina's Genome Analyzer platform, where at least three factors contribute to sequence bias: GC content, mappability of sequencing reads, and regional biases that might be generated by local structure. We show that relying on input control as a normalizer is not generally appropriate due to sample-to-sample variation in bias. To correct sequence bias, we present BEADS (bias elimination algorithm for deep sequencing), a simple three-step normalization scheme that successfully unmasks real binding patterns in ChIP-seq data. We suggest that this procedure be done routinely prior to data interpretation and downstream analyses.
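
The abstract names three sources of bias but does not spell out the three normalization steps. As a rough illustration only, the sketch below assumes one correction per source: reweighting reads by GC content, dividing binned signal by mappability, and dividing by a regional-bias estimate. The function names, the `min_map` cutoff, and the input layout are hypothetical and are not taken from the BEADS software.

```python
from collections import Counter


def gc_weights(read_gc, genome_gc):
    """Assumed step 1: weight for each GC bin = expected genomic
    frequency / observed read frequency."""
    observed = Counter(read_gc)
    expected = Counter(genome_gc)
    n_obs, n_exp = len(read_gc), len(genome_gc)
    return {
        gc: (expected[gc] / n_exp) / (observed[gc] / n_obs)
        for gc in observed
        if expected[gc] > 0
    }


def normalize_bins(signal, mappability, regional_bias, min_map=0.25):
    """Assumed steps 2 and 3: divide GC-weighted signal per bin by
    mappability, then by a regional-bias estimate.

    Bins with mappability below min_map (a hypothetical cutoff) or with
    a non-positive bias estimate are masked as None."""
    corrected = []
    for s, m, r in zip(signal, mappability, regional_bias):
        if m < min_map or r <= 0:
            corrected.append(None)
        else:
            corrected.append(s / m / r)
    return corrected


# Toy data: GC percentage per read and per genomic window.
weights = gc_weights(read_gc=[40, 40, 40, 60], genome_gc=[40, 60])
# Toy per-bin values; the second bin is masked for low mappability.
result = normalize_bins([2.0, 1.0, 3.0], [1.0, 0.1, 0.5], [2.0, 1.0, 1.0])
```

In this toy case, reads with 40% GC are over-represented relative to the genome and receive a weight below 1, while the under-represented 60% GC reads are weighted up. How the regional-bias estimate is obtained is left open here, since the abstract does not specify it.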

Pubmed ID: 21646344

Authors

  • Cheung MS
  • Down TA
  • Latorre I
  • Ahringer J

Journal

Nucleic acids research

Publication Data

August 23, 2011

Associated Grants

  • Agency: Wellcome Trust, Id: 054523
  • Agency: NHGRI NIH HHS, Id: 1-U01-HG004270-01
  • Agency: Cancer Research UK, Id:

Mesh Terms

  • Algorithms
  • Animals
  • Base Composition
  • Caenorhabditis elegans
  • Chromatin Immunoprecipitation
  • DNA, Helminth
  • High-Throughput Nucleotide Sequencing
  • Sequence Analysis, DNA