An algorithm for finding protein–DNA binding sites with applications to chromatin-immunoprecipitation microarray experiments

8 July 2002

journal article
research article
Published by Springer Nature in Nature Biotechnology

Vol. 20 (8), 835-839
https://doi.org/10.1038/nbt717

Abstract

Chromatin immunoprecipitation followed by cDNA microarray hybridization (ChIP–array) has become a popular procedure for studying genome-wide protein–DNA interactions and transcription regulation. However, it can only map the probable protein–DNA interaction loci within 1–2 kilobases resolution. To pinpoint interaction sites down to the base-pair level, we introduce a computational method, Motif Discovery scan (MDscan), that examines the ChIP–array-selected sequences and searches for DNA sequence motifs representing the protein–DNA interaction sites. MDscan combines the advantages of two widely adopted motif search strategies, word enumeration^1,2,3,4 and position-specific weight matrix updating^5,6,7,8,9, and incorporates the ChIP–array ranking information to accelerate searches and enhance their success rates. MDscan correctly identified all the experimentally verified motifs from published ChIP–array experiments in yeast^10,11,12,13 (STE12, GAL4, RAP1, SCB, MCB, MCM1, SFF, and SWI5), and predicted two motif patterns for the differential binding of Rap1 protein in telomere regions. In our studies, the method was faster and more accurate than several established motif-finding algorithms^5,8,9. MDscan can be used to find DNA motifs not only in ChIP–array experiments but also in other experiments in which a subgroup of the sequences can be inferred to contain relatively abundant motif sites. The MDscan web server can be accessed at http://BioProspector.stanford.edu/MDscan/.
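The abstract names three ingredients: word enumeration, position-specific weight matrix updating, and use of the ChIP–array ranking. The sketch below shows one way those ideas could fit together. It is not MDscan's actual procedure, which the abstract does not spell out. The function names, the `top_n` cutoff, the Hamming-distance threshold `max_mismatch`, and the toy sequences are all assumptions made for illustration, and the matrix here holds raw base counts rather than any particular scoring scheme.

```python
from collections import Counter

BASES = "ACGT"  # assumption: sequences are uppercase and contain only A, C, G, T


def enumerate_words(seqs, w):
    """Word enumeration: count every width-w word in the given sequences."""
    counts = Counter()
    for s in seqs:
        for i in range(len(s) - w + 1):
            counts[s[i:i + w]] += 1
    return counts


def hamming(a, b):
    """Number of mismatching positions between two equal-length words."""
    return sum(x != y for x, y in zip(a, b))


def build_matrix(seed, seqs, max_mismatch):
    """Tabulate per-position base counts over all words close to the seed."""
    w = len(seed)
    matrix = [dict.fromkeys(BASES, 0) for _ in range(w)]
    for s in seqs:
        for i in range(len(s) - w + 1):
            word = s[i:i + w]
            if hamming(seed, word) <= max_mismatch:
                for pos, base in enumerate(word):
                    matrix[pos][base] += 1
    return matrix


def consensus(matrix):
    """Most frequent base at each position of the count matrix."""
    return "".join(max(BASES, key=col.get) for col in matrix)


def find_motif(ranked_seqs, w, top_n, max_mismatch=1):
    """Seed from the top-ranked sequences, then refine over all sequences.

    ranked_seqs is assumed to be sorted by ChIP-array enrichment, best first.
    """
    top = ranked_seqs[:top_n]
    seed, _ = enumerate_words(top, w).most_common(1)[0]
    return seed, build_matrix(seed, ranked_seqs, max_mismatch)


# Toy data, invented for this example (not from any experiment).
ranked = ["GGTGACTCATT", "ATGACTCAGC", "CCTGAGTCAAG", "AAAAAAAAAA"]
seed, matrix = find_motif(ranked, w=7, top_n=2)
print(seed, consensus(matrix))
```

Restricting the seed search to the top-ranked sequences keeps the enumeration step small, which is the intuition behind using the ranking to accelerate the search. The full set is then used only to collect near matches to the seed.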

This publication has 17 references indexed in Scilit:

Serial Regulation of Transcriptional Regulators in the Yeast Cell Cycle
Cell, 2001
Promoter-specific binding of Rap1 revealed by genome-wide maps of protein–DNA association
Nature Genetics, 2001
Genomic binding sites of the yeast cell-cycle transcription factors SBF and MBF
Nature, 2001
Genome-Wide Location and Function of DNA Binding Proteins
Science, 2000
Building a dictionary for genomes: Identification of presumptive regulatory sites by statistical analysis
Proceedings of the National Academy of Sciences, 2000
Finding DNA regulatory motifs within unaligned noncoding sequences clustered by whole-genome mRNA quantitation
Nature Biotechnology, 1998
Extracting regulatory sites from the upstream region of yeast genes by computational analysis of oligonucleotide frequencies
Journal of Molecular Biology, 1998
Bayesian Models for Multiple Local Sequence Alignment and Gibbs Sampling Strategies
Journal of the American Statistical Association, 1995
Identification of consensus patterns in unaligned DNA sequences known to be functionally related
Bioinformatics, 1990
The yeast STE12 protein binds to the DNA sequence mediating pheromone induction
Proceedings of the National Academy of Sciences, 1989

Cited by 594 articles