A signal-noise model for significance analysis of ChIP-seq with negative control.

MOTIVATION: ChIP-seq is becoming the main approach to the genome-wide study of protein-DNA interactions and histone modifications. Existing informatics tools perform well at extracting strong ChIP-enriched sites. However, two questions remain to be answered: (i) to what extent can a ChIP-seq experiment reveal the weak ChIP-enriched sites? (ii) are the weak sites biologically meaningful? To answer these questions, it is necessary to distinguish the weak ChIP signals from background noise.

RESULTS: We propose a linear signal-noise model, in which a noise rate is introduced to represent the fraction of noise in a ChIP library. We developed an iterative algorithm to estimate the noise rate using a control library, and derived a library-swapping strategy for false discovery rate estimation. These approaches were integrated into a general-purpose framework, named CCAT (Control-based ChIP-seq Analysis Tool), for the significance analysis of ChIP-seq. Applications to H3K4me3 and H3K36me3 datasets showed that CCAT predicted significantly more ChIP-enriched sites than the previous methods did. With the high sensitivity of CCAT prediction, we revealed distinct chromatin features associated with the strong and weak H3K4me3 sites.

AVAILABILITY: http://cmb.gis.a-star.edu.sg/ChIPSeq/tools.htm.
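The abstract does not spell out the algorithm, so the following Python sketch is only one plausible reading of the two ideas it names, not CCAT's actual implementation. It assumes that per-bin read counts are available for both libraries, that the control counts have already been scaled to the ChIP library size, and that the linear model takes the form chip[i] ≈ signal[i] + alpha * control[i], with alpha as the noise rate. The function names `estimate_noise_rate` and `swap_fdr`, the background rule, and the example numbers are all hypothetical.

```python
from typing import Sequence


def estimate_noise_rate(chip: Sequence[float], control: Sequence[float],
                        max_iter: int = 100) -> float:
    """Iteratively estimate the noise rate alpha (illustrative assumption).

    Assumed model: chip[i] ~ signal[i] + alpha * control[i], where the
    control counts are pre-scaled to the ChIP library size.
    """
    n = len(chip)
    background = set(range(n))  # start by treating every bin as noise
    alpha = 1.0
    for _ in range(max_iter):
        chip_bg = sum(chip[i] for i in background)
        ctrl_bg = sum(control[i] for i in background)
        if ctrl_bg == 0:
            break  # no usable background bins; keep the last estimate
        # In background bins the signal is assumed to be zero,
        # so chip_bg ~ alpha * ctrl_bg.
        alpha = chip_bg / ctrl_bg
        # Bins whose ChIP count exceeds the expected noise are set aside
        # as candidate signal; the rest form the new background.
        new_background = {i for i in range(n) if chip[i] <= alpha * control[i]}
        if new_background == background:
            break  # background set is stable, so alpha has converged
        background = new_background
    return alpha


def swap_fdr(real_scores: Sequence[float], swapped_scores: Sequence[float],
             threshold: float) -> float:
    """Estimate the FDR at a score threshold by library swapping.

    real_scores: region scores with ChIP as treatment and control as control.
    swapped_scores: region scores with the two libraries' roles exchanged.
    Regions called in the swapped run are taken as a proxy for false
    positives in the real run (illustrative assumption).
    """
    real = sum(1 for s in real_scores if s >= threshold)
    swapped = sum(1 for s in swapped_scores if s >= threshold)
    if real == 0:
        return 0.0
    return min(1.0, swapped / real)


# Toy data: ten bins, true noise rate 0.5, two bins carrying extra signal.
control_counts = [10] * 10
chip_counts = [5] * 8 + [30, 30]
noise_rate = estimate_noise_rate(chip_counts, control_counts)

# Toy scores: 4 real regions and 1 swapped region pass the threshold of 3.
fdr = swap_fdr([10, 8, 6, 3, 2], [4, 2, 1], threshold=3)
```

On this toy input the first pass treats all bins as background, the second pass drops the two enriched bins, and the estimate then stabilises. A real analysis would work on noisy counts, where the simple "greater than expected noise" rule used here would need a proper statistical test.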

Pubmed ID: 20371496

Authors

  • Xu H
  • Handoko L
  • Wei X
  • Ye C
  • Sheng J
  • Wei CL
  • Lin F
  • Sung WK

Journal

Bioinformatics (Oxford, England)

Publication Data

May 1, 2010

Associated Grants

None

Mesh Terms

  • Algorithms
  • Binding Sites
  • Chromatin Immunoprecipitation
  • Computational Biology
  • Computer Simulation
  • Gene Expression Regulation
  • Genome
  • Histones
  • Models, Statistical
  • Poisson Distribution
  • Reproducibility of Results
  • Software