FIMO: scanning for occurrences of a given motif

Open Access

16 February 2011

journal article
research article
Published by Oxford University Press (OUP) in Bioinformatics

Vol. 27 (7), 1017-1018
https://doi.org/10.1093/bioinformatics/btr064

Abstract

Summary: A motif is a short DNA or protein sequence that contributes to the biological function of the sequence in which it resides. Over the past several decades, many computational methods have been described for identifying, characterizing and searching with sequence motifs. Critical to nearly any motif-based sequence analysis pipeline is the ability to scan a sequence database for occurrences of a given motif described by a position-specific frequency matrix.

Results: We describe Find Individual Motif Occurrences (FIMO), a software tool for scanning DNA or protein sequences with motifs described as position-specific scoring matrices. The program computes a log-likelihood ratio score for each position in a given sequence database, uses established dynamic programming methods to convert this score to a P-value and then applies false discovery rate analysis to estimate a q-value for each position in the given sequence. FIMO provides output in a variety of formats, including HTML, XML and several Santa Cruz Genome Browser formats. The program is efficient, allowing for the scanning of DNA sequences at a rate of 3.5 Mb/s on a single CPU.

Availability and Implementation: FIMO is part of the MEME Suite software toolkit. A web server and source code are available at http://meme.sdsc.edu.

Contact: t.bailey@imb.uq.edu.au

Supplementary information: Supplementary data are available at Bioinformatics online.
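
The three steps named in the abstract (log-likelihood ratio scoring, conversion of scores to P-values by dynamic programming, and false discovery rate analysis) can be illustrated with a minimal Python sketch. This is not FIMO's source code. All function names are hypothetical, and the zero-order background model, the pseudocount, the integer scaling factor, and the restriction to ACGT-only DNA on a single strand are assumptions made for illustration. The abstract does not say which q-value estimator is used, so the Benjamini-Hochberg adjustment stands in for that step here.

```python
import math
from collections import defaultdict

BASES = "ACGT"

def log_odds_matrix(freqs, background, pseudocount=0.01, scale=100):
    """Turn per-position base frequencies into integer-scaled log2 odds scores."""
    matrix = []
    for column in freqs:
        total = sum(column[b] + pseudocount for b in BASES)
        matrix.append({
            b: round(scale * math.log2((column[b] + pseudocount) / total / background[b]))
            for b in BASES
        })
    return matrix

def scan(sequence, matrix):
    """Return (start, score) for every motif-width window of an ACGT-only sequence."""
    width = len(matrix)
    return [
        (start, sum(matrix[i][sequence[start + i]] for i in range(width)))
        for start in range(len(sequence) - width + 1)
    ]

def tail_probabilities(matrix, background):
    """Dynamic programming: P(score >= s) for every reachable integer score s."""
    dist = {0: 1.0}  # distribution of partial scores over the columns seen so far
    for column in matrix:
        nxt = defaultdict(float)
        for partial, prob in dist.items():
            for b in BASES:
                nxt[partial + column[b]] += prob * background[b]
        dist = nxt
    tail, running = {}, 0.0
    for s in sorted(dist, reverse=True):  # accumulate from the highest score down
        running += dist[s]
        tail[s] = min(running, 1.0)
    return tail

def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (a stand-in for q-value estimation)."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    qvalues = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):  # walk from the largest p-value to the smallest
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        qvalues[i] = running_min
    return qvalues

def find_occurrences(sequence, freqs, background):
    """Score every window, then attach a p-value and an adjusted p-value to each."""
    matrix = log_odds_matrix(freqs, background)
    tail = tail_probabilities(matrix, background)
    hits = scan(sequence, matrix)
    pvalues = [tail[score] for _, score in hits]
    qvalues = benjamini_hochberg(pvalues)
    return [(start, score, p, q) for (start, score), p, q in zip(hits, pvalues, qvalues)]

# Toy example: a width-4 motif that strongly prefers the word ACGT.
background = {b: 0.25 for b in BASES}
freqs = [{b: 1.0 if b == target else 0.0 for b in BASES} for target in "ACGT"]
hits = find_occurrences("TTACGTTT", freqs, background)
best = min(hits, key=lambda h: h[2])  # window with the smallest p-value
print(best)
```

Scaling the scores to integers before scanning keeps the dynamic programming table finite and guarantees that every window score has an exact entry in the tail-probability lookup. A production scanner would also need to handle ambiguous bases, the reverse strand, and protein alphabets, none of which this sketch attempts.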

Cited by 3790 articles