GimmeMotifs: a de novo motif prediction pipeline for ChIP-sequencing experiments
Open Access
- 15 November 2010
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 27 (2), 270-271
- https://doi.org/10.1093/bioinformatics/btq636
Abstract
Summary: Accurate prediction of transcription factor binding motifs that are enriched in a collection of sequences remains a computational challenge. Here we report on GimmeMotifs, a pipeline that incorporates an ensemble of computational tools to predict motifs de novo from ChIP-sequencing (ChIP-seq) data. Similar redundant motifs are compared using the weighted information content (WIC) similarity score and clustered using an iterative procedure. A comprehensive output report is generated with several different evaluation metrics to compare and evaluate the results. Benchmarks show that the method performs well on human and mouse ChIP-seq datasets. GimmeMotifs consists of a suite of command-line scripts that can be easily implemented in a ChIP-seq analysis pipeline.
Availability: GimmeMotifs is implemented in Python and runs on Linux. The source code is freely available for download at http://www.ncmls.eu/bioinfo/gimmemotifs/.
Contact: s.vanheeringen@ncmls.ru.nl
Supplementary Information: Supplementary data are available at Bioinformatics online.
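The iterative clustering of redundant motifs mentioned in the Summary can be illustrated with a minimal sketch. This is not the GimmeMotifs implementation: the WIC score is not defined in this abstract, so the sketch substitutes a mean column-wise Pearson correlation as a stand-in similarity. It also assumes equal-length motifs with no alignment offsets and no reverse-complement comparison. All function names and the threshold value are hypothetical.

```python
from itertools import combinations
from math import sqrt

# A motif is a list of columns; each column holds frequencies for A, C, G, T.

def pearson(x, y):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)

def similarity(m1, m2):
    """Stand-in similarity (NOT the WIC score): mean column-wise Pearson correlation."""
    return sum(pearson(c1, c2) for c1, c2 in zip(m1, m2)) / len(m1)

def weighted_average(m1, w1, m2, w2):
    """Merge two motifs into one, weighting each by the number of motifs it represents."""
    total = w1 + w2
    return [[(a * w1 + b * w2) / total for a, b in zip(c1, c2)]
            for c1, c2 in zip(m1, m2)]

def cluster_motifs(motifs, threshold=0.9):
    """Repeatedly merge the most similar pair until no pair reaches the threshold.

    Returns a list of (average_motif, weight, member_indices) tuples.
    The threshold of 0.9 is an arbitrary illustrative value.
    """
    clusters = [(m, 1, [i]) for i, m in enumerate(motifs)]
    while len(clusters) > 1:
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: similarity(clusters[p[0]][0], clusters[p[1]][0]),
        )
        (m1, w1, ids1), (m2, w2, ids2) = clusters[i], clusters[j]
        if similarity(m1, m2) < threshold:
            break  # the best remaining pair is too dissimilar to merge
        merged = (weighted_average(m1, w1, m2, w2), w1 + w2, ids1 + ids2)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters

# Toy input: motifs 0 and 1 are near-duplicates, motif 2 is different.
example_motifs = [
    [[0.90, 0.03, 0.03, 0.04], [0.05, 0.05, 0.85, 0.05], [0.10, 0.70, 0.10, 0.10]],
    [[0.85, 0.05, 0.05, 0.05], [0.05, 0.05, 0.80, 0.10], [0.10, 0.75, 0.05, 0.10]],
    [[0.05, 0.05, 0.05, 0.85], [0.80, 0.10, 0.05, 0.05], [0.10, 0.10, 0.10, 0.70]],
]
example_clusters = cluster_motifs(example_motifs)
```

On the toy input, the two near-duplicate motifs are merged into one averaged motif and the third stays separate. A real pipeline would also need to handle motifs of different lengths, alignment offsets, and reverse complements, all of which this sketch leaves out.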