A practical comparison of methods for detecting transcription factor binding sites in ChIP-seq experiments

Open Access

18 December 2009

journal article
research article
Published by Springer Nature in BMC Genomics

Vol. 10 (1), 618
https://doi.org/10.1186/1471-2164-10-618

Abstract

Background: Chromatin immunoprecipitation coupled with massively parallel sequencing (ChIP-seq) is increasingly being applied to study transcriptional regulation on a genome-wide scale. While numerous algorithms have recently been proposed for analysing the large ChIP-seq datasets, their relative merits and potential limitations remain unclear in practical applications.

Results: The present study compares state-of-the-art algorithms for detecting transcription factor binding sites in four diverse ChIP-seq datasets under a variety of practical research settings. First, we demonstrate how the biological conclusions may change dramatically when different algorithms are applied. The reproducibility across biological replicates is then investigated as an internal validation of the detections. Finally, the binding sites predicted by each method are compared to high-scoring binding motifs as well as to binding regions confirmed in independent qPCR experiments.

Conclusions: In general, our results indicate that the optimal choice of computational approach depends heavily on the dataset under analysis. In addition to giving users of this technology valuable information about the characteristics of the binding site detection approaches, the systematic evaluation framework also provides a useful reference for developers of improved algorithms for ChIP-seq data.
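As a rough illustration of the replicate-reproducibility check mentioned in the Results, the sketch below computes the fraction of peaks called in one replicate that overlap at least one peak called in another. It is not the procedure used in the study: the function name `overlap_fraction`, the `(chrom, start, end)` tuple layout, the half-open coordinate convention, and the toy peak lists are all assumptions made for illustration.

```python
from collections import defaultdict


def overlap_fraction(peaks_a, peaks_b):
    """Fraction of peaks in peaks_a overlapping at least one peak in peaks_b.

    Peaks are (chrom, start, end) tuples with half-open coordinates
    (an assumed convention; adapt it to the peak caller's output format).
    """
    if not peaks_a:
        return 0.0

    # Group the second replicate's peaks by chromosome for lookup.
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks_b:
        by_chrom[chrom].append((start, end))

    hits = 0
    for chrom, start, end in peaks_a:
        # Two half-open intervals overlap if each starts before the other ends.
        if any(start < b_end and b_start < end
               for b_start, b_end in by_chrom.get(chrom, [])):
            hits += 1
    return hits / len(peaks_a)


# Hypothetical toy peak lists standing in for two biological replicates.
rep1 = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150)]
rep2 = [("chr1", 150, 250), ("chr2", 300, 400)]

print(overlap_fraction(rep1, rep2))
print(overlap_fraction(rep2, rep1))
```

The measure is asymmetric, so it is worth reporting in both directions. For genome-scale peak lists, a sorted sweep or an interval tree would replace the quadratic scan used here.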

