Conservation of DNA Regulatory Motifs and Discovery of New Motifs in Microbial Genomes

Abstract
Regulatory motifs can be found by local multiple alignment of upstream regions from coregulated sets of genes, or regulons. We searched 17 complete microbial genomes for regulatory motifs using the program AlignACE together with a set of filters that helped us choose the motifs most likely to be biologically relevant. We searched the upstream regions of potentially coregulated genes grouped by three methods: (1) genes that make up functional pathways; (2) genes homologous to regulons from a well-studied species (Escherichia coli); and (3) groups of genes derived from conserved operons. This last grouping is based on the observation that genes making up homologous regulons in different species are often assorted into coregulated operons in different combinations, which allows partial reconstruction of regulons by looking at operon structure across several species. Unlike other methods for predicting regulons, this method does not depend on the availability of experimental data other than the genome sequence and the locations of genes. New, statistically significant motifs were found in the genome sequence of each organism using each grouping method. The most significant new motif was found upstream of genes in the methane-metabolism functional group in Methanobacterium thermoautotrophicum. We found that at least 27% of the known E. coli DNA-regulatory motifs are conserved in one or more distantly related eubacteria. In other organisms, we also observed significant motifs that differed from the E. coli motif upstream of sets of genes homologous to known E. coli regulons, including Crp, LexA, and ArcA in Bacillus subtilis; four anaerobic regulons in Archaeoglobus fulgidus (NarL, NarP, Fnr, and ModE); and the PhoB, PurR, RpoH, and FhlA regulons in other archaebacterial species.
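The conserved-operon grouping described above can be illustrated with a minimal sketch. It assumes that each gene has already been assigned to an ortholog group and that operon membership is known for each species; it then merges ortholog groups that share an operon in any species. This is one plausible reading of the idea, not necessarily the procedure used in the study, and the function name, species labels, and ortholog-group identifiers are hypothetical.

```python
from collections import defaultdict


def group_by_conserved_operons(operons_by_species):
    """Merge ortholog groups that co-occur in an operon in any species.

    operons_by_species maps a species name to a list of operons, where each
    operon is a list of ortholog-group identifiers (hypothetical input format).
    Returns a sorted list of sorted candidate gene groups.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for operons in operons_by_species.values():
        for operon in operons:
            for gene in operon:
                find(gene)  # register single-gene operons too
            for other in operon[1:]:
                union(operon[0], other)

    groups = defaultdict(set)
    for gene in list(parent):
        groups[find(gene)].add(gene)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Toy input: og1 and og2 share an operon in species_A, og2 and og3 in species_B,
# so all three end up in one candidate group.
operons = {
    "species_A": [["og1", "og2"], ["og3"]],
    "species_B": [["og2", "og3"], ["og4", "og5"]],
}
print(group_by_conserved_operons(operons))
```

In this toy case no single species places og1, og2, and og3 in one operon, yet combining operon structure across the two species links them, which is the sense in which the reconstruction is partial and cross-species.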
We also used motif conservation to aid in finding new motifs by grouping upstream regions from closely related bacteria, thus increasing the number of instances of the motif in the sequence set to be aligned. For example, by grouping upstream sequences from three archaebacterial species, we found a conserved motif, not detected in any of the individual genomes, that may regulate ferrous ion transport. Discovery of conserved motifs becomes easier as the number of closely related genome sequences increases.
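As a rough sketch of the pooling step, the snippet below collects the upstream sequences of genes from a chosen set of ortholog groups across several species into a single input set for a motif finder. The data layout, function name, species labels, gene identifiers, and sequences are all hypothetical placeholders; the alignment itself (for example with AlignACE) is not shown.

```python
# Toy illustration only: species names, gene IDs, ortholog-group labels, and
# sequences below are hypothetical placeholders, not data from the study.

def pool_upstream_regions(upstream_by_species, gene_groups):
    """Pool upstream sequences for a set of ortholog groups across species.

    upstream_by_species maps species -> {gene_id: (ortholog_group, sequence)}.
    Returns a list of (species, gene_id, sequence) tuples in a stable order.
    """
    wanted = set(gene_groups)
    pooled = []
    for species in sorted(upstream_by_species):
        genes = upstream_by_species[species]
        for gene_id in sorted(genes):
            group, sequence = genes[gene_id]
            if group in wanted:
                pooled.append((species, gene_id, sequence))
    return pooled


upstream = {
    "species_A": {"a1": ("og_transport", "ACGTTGACA"), "a2": ("og_other", "TTTTAAAA")},
    "species_B": {"b1": ("og_transport", "ACGTTGACT")},
    "species_C": {"c1": ("og_transport", "ACGATGACA")},
}

# Pooling three species yields more candidate motif instances than one genome.
pooled = pool_upstream_regions(upstream, {"og_transport"})
single = pool_upstream_regions({"species_A": upstream["species_A"]}, {"og_transport"})
print(len(single), len(pooled))
```

The pooled list would then be passed to the motif finder in place of the single-genome set, giving it more instances of a conserved site to align.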