High-throughput SELEX–SAGE method for quantitative modeling of transcription-factor binding sites
- 8 July 2002
- journal article
- research article
- Published by Springer Nature in Nature Biotechnology
- Vol. 20 (8), 831-835
- https://doi.org/10.1038/nbt718
Abstract
The ability to determine the location and relative strength of all transcription-factor binding sites in a genome is important both for a comprehensive understanding of gene regulation and for effective promoter engineering in biotechnological applications. Here we present a bioinformatically driven experimental method to accurately define the DNA-binding sequence specificity of transcription factors. A generalized profile [1] was used as a predictive quantitative model for binding sites, and its parameters were estimated from in vitro–selected ligands using standard hidden Markov model training algorithms [2,3]. Computer simulations showed that several thousand low- to medium-affinity sequences are required to generate a profile of desired accuracy. To produce data on this scale, we applied high-throughput genomics methods to the biochemical problem addressed here. A method combining systematic evolution of ligands by exponential enrichment (SELEX) [4] and serial analysis of gene expression (SAGE) [5] protocols was coupled to an automated quality-controlled sequence extraction procedure based on Phred quality scores [6]. This allowed the sequencing of a database of more than 10,000 potential DNA ligands for the CTF/NFI transcription factor. The resulting binding-site model defines the sequence specificity of this protein with a high degree of accuracy not achieved earlier and thereby makes it possible to identify previously unknown regulatory sequences in genomic DNA. A covariance analysis of the selected sites revealed non-independent base preferences at different nucleotide positions, providing insight into the binding mechanism.
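The quality-controlled extraction step mentioned in the abstract can be illustrated with a minimal sketch. A Phred score Q corresponds to an estimated base-calling error probability of 10^(-Q/10). The abstract gives neither the threshold nor the actual procedure, so the cutoff `min_q=20`, the function names, and the toy reads below are assumptions made purely for illustration.

```python
from typing import List, Sequence, Tuple


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score to an estimated base-call error probability."""
    return 10 ** (-q / 10)


def passes_quality(quals: Sequence[int], min_q: int = 20) -> bool:
    """Return True if every base has a quality of at least min_q.

    min_q=20 (1% error per base) is an assumed cutoff, not a value from the paper.
    """
    return len(quals) > 0 and min(quals) >= min_q


def extract_high_quality(
    reads: Sequence[Tuple[str, Sequence[int]]], min_q: int = 20
) -> List[str]:
    """Keep sequences whose per-base qualities all meet the threshold.

    Reads whose sequence and quality lengths disagree are discarded.
    """
    return [
        seq
        for seq, quals in reads
        if len(seq) == len(quals) and passes_quality(quals, min_q)
    ]


# Toy reads (hypothetical data): (sequence, per-base Phred scores)
reads = [
    ("TTGGCAAGCTGCCAA", [30] * 15),              # uniformly high quality: kept
    ("TTGGCTTACAGCCAA", [30] * 7 + [8] + [30] * 7),  # one low-quality base: dropped
]
kept = extract_high_quality(reads)
```

Requiring every base to pass is the strictest simple rule; an average-quality or expected-error criterion would be an equally plausible alternative.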
This publication has 16 references indexed in Scilit:
- Non-independence of Mnt repressor-operator interaction determined by a new quantitative multiple fluorescence relative affinity (QuMFRA) assay. Nucleic Acids Research, 2001
- DNA Binding Specificity of Different STAT Proteins. Journal of Biological Chemistry, 2001
- Experimental analysis and computer prediction of CTF/NFI transcription factor DNA binding sites. Journal of Molecular Biology, 2000
- The mathematics of SELEX against complex targets. Journal of Molecular Biology, 1998
- Prediction of complete gene structures in human genomic DNA. Journal of Molecular Biology, 1997
- A flexible motif search technique based on generalized profiles. Computers & Chemistry, 1996
- Serial Analysis of Gene Expression. Science, 1995
- All you wanted to know about SELEX. Molecular Biology Reports, 1994
- A quantitative analysis of nuclear factor I/DNA interactions. Nucleic Acids Research, 1988
- Selection of DNA binding sites by regulatory proteins. Journal of Molecular Biology, 1987