An algorithm for finding protein–DNA binding sites with applications to chromatin-immunoprecipitation microarray experiments
- 8 July 2002
- journal article
- research article
- Published by Springer Nature in Nature Biotechnology
- Vol. 20 (8), 835-839
- https://doi.org/10.1038/nbt717
Abstract
Chromatin immunoprecipitation followed by cDNA microarray hybridization (ChIP–array) has become a popular procedure for studying genome-wide protein–DNA interactions and transcription regulation. However, it can only map the probable protein–DNA interaction loci within 1–2 kilobases resolution. To pinpoint interaction sites down to the base-pair level, we introduce a computational method, Motif Discovery scan (MDscan), that examines the ChIP–array-selected sequences and searches for DNA sequence motifs representing the protein–DNA interaction sites. MDscan combines the advantages of two widely adopted motif search strategies, word enumeration [1,2,3,4] and position-specific weight matrix updating [5,6,7,8,9], and incorporates the ChIP–array ranking information to accelerate searches and enhance their success rates. MDscan correctly identified all the experimentally verified motifs from published ChIP–array experiments in yeast [10,11,12,13] (STE12, GAL4, RAP1, SCB, MCB, MCM1, SFF, and SWI5), and predicted two motif patterns for the differential binding of Rap1 protein in telomere regions. In our studies, the method was faster and more accurate than several established motif-finding algorithms [5,8,9]. MDscan can be used to find DNA motifs not only in ChIP–array experiments but also in other experiments in which a subgroup of the sequences can be inferred to contain relatively abundant motif sites. The MDscan web server can be accessed at http://BioProspector.stanford.edu/MDscan/.
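The abstract names two strategies, word enumeration and position-specific weight matrix updating, plus the use of ChIP–array ranking. The toy Python sketch below illustrates how those three ideas can fit together in general; it is not the MDscan implementation, whose scoring and refinement steps are not described in the abstract. All function names, the mismatch threshold, the pseudocount, and the example sequences are assumptions made up for illustration.

```python
from collections import Counter


def enumerate_words(seqs, w):
    """Word enumeration: count how many sequences contain each w-mer."""
    counts = Counter()
    for s in seqs:
        # A set ensures each w-mer is counted at most once per sequence.
        counts.update({s[i:i + w] for i in range(len(s) - w + 1)})
    return counts


def hamming(a, b):
    """Number of mismatching positions between two equal-length words."""
    return sum(x != y for x, y in zip(a, b))


def build_matrix(seed, seqs, max_mismatch=1, pseudocount=0.5):
    """Collect w-mers close to the seed and estimate per-position base frequencies."""
    w = len(seed)
    sites = [
        s[i:i + w]
        for s in seqs
        for i in range(len(s) - w + 1)
        if hamming(s[i:i + w], seed) <= max_mismatch
    ]
    total = len(sites) + 4 * pseudocount
    matrix = []
    for pos in range(w):
        column = Counter(site[pos] for site in sites)
        matrix.append({b: (column[b] + pseudocount) / total for b in "ACGT"})
    return matrix, sites


def find_motif(ranked_seqs, w, top_n=3):
    """Seed from the top-ranked sequences, then build the matrix from all of them."""
    counts = enumerate_words(ranked_seqs[:top_n], w)
    # sorted() gives a deterministic choice if several words tie.
    seed = max(sorted(counts), key=counts.get)
    matrix, sites = build_matrix(seed, ranked_seqs)
    return seed, matrix, sites


# Invented toy data, ordered from strongest to weakest (hypothetical) enrichment.
ranked = [
    "CCTGAAACAGG",
    "ATTGAAACATC",
    "GGATGAAACAT",
    "ACGTTGAAGCA",  # carries a one-mismatch variant of the planted word
    "CCCCCCCCCCC",  # background sequence with no site
]
seed, pwm, sites = find_motif(ranked, w=7)
print(seed)   # → TGAAACA
print(sites)
```

Seeding only from the top-ranked sequences keeps the enumeration step small, while the matrix step can then pull in weaker variants from the lower-ranked sequences; this mirrors the general idea stated in the abstract, under the simplifying assumptions above.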
This publication has 17 references indexed in Scilit:
- Serial Regulation of Transcriptional Regulators in the Yeast Cell Cycle. Cell, 2001
- Promoter-specific binding of Rap1 revealed by genome-wide maps of protein–DNA association. Nature Genetics, 2001
- Genomic binding sites of the yeast cell-cycle transcription factors SBF and MBF. Nature, 2001
- Genome-Wide Location and Function of DNA Binding Proteins. Science, 2000
- Building a dictionary for genomes: Identification of presumptive regulatory sites by statistical analysis. Proceedings of the National Academy of Sciences, 2000
- Finding DNA regulatory motifs within unaligned noncoding sequences clustered by whole-genome mRNA quantitation. Nature Biotechnology, 1998
- Extracting regulatory sites from the upstream region of yeast genes by computational analysis of oligonucleotide frequencies. Journal of Molecular Biology, 1998
- Bayesian Models for Multiple Local Sequence Alignment and Gibbs Sampling Strategies. Journal of the American Statistical Association, 1995
- Identification of consensus patterns in unaligned DNA sequences known to be functionally related. Bioinformatics, 1990
- The yeast STE12 protein binds to the DNA sequence mediating pheromone induction. Proceedings of the National Academy of Sciences, 1989