Combining phylogenetic motif discovery and motif clustering to predict co-regulated genes
Open Access
- 16 August 2005
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 21 (20), 3832-3839
- https://doi.org/10.1093/bioinformatics/bti628
Abstract
Motivation: We present a sequence-based framework and algorithm PHYLOCLUS for predicting co-regulated genes. In our approach, de novo discovery methods are used to find motifs conserved by evolution and then a Bayesian hierarchical clustering model is used to cluster these motifs, thereby grouping together genes that are putatively co-regulated. Our clustering procedure allows both the number of clusters and the motif width within each cluster to be unknown.
Results: We use our framework to predict co-regulated genes in the bacterium Bacillus subtilis using six other closely related bacterial species. Our predicted motifs and gene clusters are validated using several external sources and significant clusters are examined in detail. An extension to the discovery and clustering of two-block motifs can be used for inference about synergistic binding relationships between transcription factors.
Availability: Software and Supplementary Materials can be downloaded at http://stat.wharton.upenn.edu/~stjensen/research/phyloclus.html or http://www.fas.harvard.edu/~junliu/phyloclus.html
Contact: stjensen@wharton.upenn.edu
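The clustering step described in the abstract can be illustrated with a small sketch. This is not the PHYLOCLUS algorithm itself. It is a simplified stand-in, assuming fixed-width motifs, a symmetric Dirichlet prior (the pseudocount `ALPHA` is an assumed value), and greedy agglomerative merging driven by a log Bayes factor, so that the number of clusters is not fixed in advance. The gene names and binding sites are made-up toy data, and all function names are hypothetical.

```python
from math import lgamma

ALPHA = 0.5  # assumed symmetric Dirichlet pseudocount per nucleotide


def count_matrix(sites):
    """Column-wise A/C/G/T counts for equal-length aligned sites."""
    counts = [[0, 0, 0, 0] for _ in range(len(sites[0]))]
    for site in sites:
        for pos, base in enumerate(site):
            counts[pos]["ACGT".index(base)] += 1
    return counts


def log_marginal(counts):
    """Log Dirichlet-categorical marginal likelihood, summed over columns."""
    total = 0.0
    for col in counts:
        total += lgamma(4 * ALPHA) - lgamma(4 * ALPHA + sum(col))
        total += sum(lgamma(ALPHA + c) - lgamma(ALPHA) for c in col)
    return total


def add_counts(a, b):
    """Pool two count matrices of the same width."""
    return [[x + y for x, y in zip(ca, cb)] for ca, cb in zip(a, b)]


def cluster_motifs(motifs):
    """Greedily merge the pair of clusters with the largest log Bayes
    factor (one shared motif vs. two separate motifs) while it is positive.

    motifs: dict mapping a gene name to its motif count matrix.
    Returns a list of gene-name lists, one per cluster.
    """
    clusters = [([gene], counts) for gene, counts in motifs.items()]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                merged = add_counts(clusters[i][1], clusters[j][1])
                lbf = (log_marginal(merged)
                       - log_marginal(clusters[i][1])
                       - log_marginal(clusters[j][1]))
                if best is None or lbf > best[0]:
                    best = (lbf, i, j, merged)
        if best[0] <= 0:  # no merge is supported by the data; stop
            break
        _, i, j, merged = best
        genes = clusters[i][0] + clusters[j][0]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append((genes, merged))
    return [sorted(genes) for genes, _ in clusters]


# Toy input (invented for illustration): conserved sites found upstream
# of three hypothetical genes.
toy = {
    "geneA": count_matrix(["ACGT", "ACGT", "ACGT"]),
    "geneB": count_matrix(["ACGT", "ACGT", "ACGA"]),
    "geneC": count_matrix(["TTTT", "TTTA", "TTTT"]),
}
result = cluster_motifs(toy)
print(result)
```

Genes whose motifs end up in the same cluster would be the putatively co-regulated ones. A fuller treatment in the spirit of the abstract would also treat the motif width within each cluster as unknown, which this sketch does not attempt.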