A Comparative Genomics Approach to Prediction of New Members of Regulons
Open Access
- 1 April 2001
- journal article
- research article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 11 (4), 566-584
- https://doi.org/10.1101/gr.149301
Abstract
Identifying the complete transcriptional regulatory network for an organism is a major challenge. For each regulatory protein, we want to know all the genes it regulates, that is, its regulon. Examples of known binding sites can be used to estimate the binding specificity of the protein and to predict other binding sites. However, binding site predictions can be unreliable, because the considerable variability of binding sites makes the true specificity of the protein difficult to determine. Because regulatory systems tend to be conserved through evolution, we can use comparisons between species to increase the reliability of binding site predictions. In this article, an approach is presented to evaluate the computational predictions of regulatory sites. We combine the prediction of transcription units having orthologous genes with the prediction of transcription factor binding sites based on probabilistic models. We augment the sets of genes in Escherichia coli that are expected to be regulated by two transcription factors, the cAMP receptor protein and the fumarate and nitrate reduction regulatory protein, through a comparison with the Haemophilus influenzae genome. At the same time, we learned more about the regulatory networks of H. influenzae, a species with much less experimental knowledge than E. coli. By studying orthologous genes subject to regulation by the same transcription factor, we also gained understanding of the evolution of the entire regulatory systems.
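The general idea in the abstract (estimate binding specificity from known sites, predict new sites, and trust a prediction more when it is conserved between species) can be illustrated with a minimal sketch. This is not the authors' probabilistic model: it assumes a plain log-odds weight matrix with a uniform background, and the function names, the threshold, and the short toy sequences are all hypothetical, chosen only for illustration. They are not real CRP or FNR sites.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds weight matrix from equal-length example sites.

    Assumption: uniform background frequency and a flat pseudocount.
    """
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(BASES)
    pwm = []
    for i in range(length):
        column = [s[i] for s in sites]
        pwm.append({
            b: math.log2(((column.count(b) + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm

def score(pwm, window):
    """Sum the per-position log-odds scores for one candidate site."""
    return sum(col[base] for col, base in zip(pwm, window))

def scan(pwm, sequence, threshold):
    """Return (start, score) for every window scoring at or above threshold."""
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        s = score(pwm, sequence[start:start + width])
        if s >= threshold:
            hits.append((start, s))
    return hits

def conserved_hits(pwm, upstream_a, upstream_b, threshold):
    """Keep predictions only if both orthologous upstream regions have a hit."""
    hits_a = scan(pwm, upstream_a, threshold)
    hits_b = scan(pwm, upstream_b, threshold)
    return (hits_a, hits_b) if hits_a and hits_b else ([], [])

# Hypothetical toy data: four aligned example sites and two upstream regions.
toy_sites = ["TGTGA", "TGTGA", "TGAGA", "TGTGC"]
toy_pwm = build_pwm(toy_sites)
hits_species_a, hits_species_b = conserved_hits(
    toy_pwm, "CCCTGTGACCC", "AAATGTGAAAA", threshold=4.0
)
print(hits_species_a, hits_species_b)
```

Requiring a hit upstream of both orthologs is the simplest possible conservation filter; a real analysis would also need ortholog and transcription-unit predictions, which this sketch takes as given.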
This publication has 44 references indexed in Scilit:
- Modeling and predicting transcriptional units of Escherichia coli genes using hidden Markov models. Bioinformatics, 1999
- The Complete Genome Sequence of Escherichia coli K-12. Science, 1997
- Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Research, 1997
- Transcriptional activation by FNR and CRP: reciprocity of binding‐site recognition. Molecular Microbiology, 1997
- Structure of the CAP-DNA Complex at 2.5 Å Resolution: A Complete Picture of the Protein-DNA Interface. Journal of Molecular Biology, 1996
- Novel FNR homologues identified in four representative oral facultative anaerobes: Capnocytophaga ochracea, Capnocytophaga sputigena, Haemophilus aphrophilus, and Actinobacillus actinomycetemcomitans. FEMS Microbiology Letters, 1996
- Metabolism and evolution of Haemophilus influenzae deduced from a whole-genome comparison with Escherichia coli. Current Biology, 1996
- Rules for Coupled Expression of Regulator and Effector Genes in Inducible Circuits. Journal of Molecular Biology, 1996
- Prediction of Function in DNA Sequence Analysis. Journal of Computational Biology, 1995
- Sequence logos: a new way to display consensus sequences. Nucleic Acids Research, 1990