Abstract
The DNA sequences of three genes--celC, crr, and gutB--have been determined for each of 11 or 12 natural isolates of Escherichia coli from the ECOR collection. These genes encode the phosphoenolpyruvate-dependent phosphotransferase-system enzyme III proteins specific for beta-glucoside sugars (celC), glucose (crr), and glucitol (gutB), respectively. There is little evidence of recombination at or among these loci; among these strains, relationships inferred from each gene are largely consistent with one another and with the relationship inferred from multilocus enzyme electrophoresis. DNA sequence diversity is similar for all three genes, particularly when only silent (synonymous) sites are considered. This is surprising because codon usage bias is much stronger at crr than at celC or gutB. The extent of divergence in the protein sequences encoded by these three genes varies considerably. The constitutively expressed glucose-specific enzyme is completely conserved. It is also surprising that the inducible glucitol-specific enzyme, which is functional, is more variable than the cellobiose-specific enzyme, which is cryptic; the latter might be expected to be under less (if any) purifying selection.