Exhaustive Search for Over-represented DNA Sequence Motifs with CisFinder
Open Access
- 9 September 2009
- journal article
- research article
- Published by Oxford University Press (OUP) in DNA Research
- Vol. 16 (5), 261-273
- https://doi.org/10.1093/dnares/dsp014
Abstract
We present CisFinder software, which generates a comprehensive list of motifs enriched in a set of DNA sequences and describes them with position frequency matrices (PFMs). A new algorithm was designed to estimate PFMs directly from counts of n-mer words with and without gaps; then PFMs are extended over gaps and flanking regions and clustered to generate non-redundant sets of motifs. The algorithm successfully identified binding motifs for 12 transcription factors (TFs) in embryonic stem cells based on published chromatin immunoprecipitation sequencing data. Furthermore, CisFinder successfully identified alternative binding motifs of TFs (e.g. POU5F1, ESRRB, and CTCF) and motifs for known and unknown co-factors of genes associated with the pluripotent state of ES cells. CisFinder also showed robust performance in the identification of motifs that were only slightly enriched in a set of DNA sequences.
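The idea of estimating a PFM from n-mer word counts can be illustrated with a small sketch. This is not CisFinder's actual algorithm or code: it is a simplified, ungapped toy version, and the function names (`count_words`, `most_enriched_word`, `pfm_from_seed`), the pseudocount, the word length, and the example sequences are all assumptions made for illustration. It counts words in a test set and a control set, picks the most enriched word as a seed, and fills each PFM column with the test-set counts of words that differ from the seed at that position only.

```python
from collections import Counter

BASES = "ACGT"

def count_words(sequences, n):
    """Count every overlapping n-mer across a list of DNA sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - n + 1):
            word = seq[i:i + n]
            if set(word) <= set(BASES):  # skip words with N or other symbols
                counts[word] += 1
    return counts

def most_enriched_word(test_counts, control_counts, pseudocount=1.0):
    """Return the word with the highest test/control count ratio.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for words that never occur in the control set.
    """
    def ratio(word):
        return (test_counts[word] + pseudocount) / (control_counts[word] + pseudocount)
    return max(test_counts, key=ratio)

def pfm_from_seed(seed, test_counts):
    """Build a PFM as a list of per-position {base: count} columns.

    Column `pos` holds the counts of the four words that match the seed
    everywhere except (possibly) at position `pos`.
    """
    pfm = []
    for pos in range(len(seed)):
        column = {}
        for base in BASES:
            variant = seed[:pos] + base + seed[pos + 1:]
            column[base] = test_counts[variant]  # Counter gives 0 if absent
        pfm.append(column)
    return pfm

# Hypothetical toy data: three test sequences carry GACGTC, one carries GACCTC.
test_seqs = ["TTGACGTCAAT", "CCGACGTCTGG", "AAGACGTCGTT", "GGGACCTCACC"]
control_seqs = ["TTTTAAAACCC", "GGGGCCCCAAA", "ACACACACACA", "TGTGTGTGTGT"]

n = 6
test_counts = count_words(test_seqs, n)
control_counts = count_words(control_seqs, n)
seed = most_enriched_word(test_counts, control_counts)
pfm = pfm_from_seed(seed, test_counts)

print("seed:", seed)
for pos, column in enumerate(pfm):
    print(pos, column)
```

The sketch omits the steps the abstract lists beyond word counting: gapped words, extension of PFMs over gaps and flanking regions, and clustering into non-redundant motif sets.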