Towards a Genome-Based Taxonomy for Prokaryotes

Top Cited Papers
Open Access
Abstract
The ranks higher than the species in the prokaryotic taxonomy are primarily designated based on phylogenetic analysis of the 16S rRNA gene sequences, but no definite standards exist for the absolute relatedness (measured by 16S rRNA or other means) between the ranks. Accordingly, it remains unknown how comparable the ranks are between different organisms. To gain insights into this question, we studied the relationship between shared gene content and genetic relatedness for 175 fully sequenced strains, using as a robust measure of relatedness the average amino acid identity (AAI) of the shared genes. Our results reveal that adjacent ranks (e.g., phylum versus class) frequently show extensive overlap in terms of genetic and gene content relatedness of the grouped organisms, and hence, the current system is of limited predictive power in this respect. The overlap between nonadjacent ranks (e.g., phylum versus family) is generally limited and attributable to clear inconsistencies of the taxonomy. In addition to providing means for standardizing taxonomy, our AAI-based approach provides a means to evaluate the robustness of alternative genetic markers for phylogenetic purposes. For instance, the 23S rRNA gene was found to be as good a marker as the 16S rRNA gene, while several of the widely distributed protein-coding genes, such as the RNA polymerase and gyrase subunits, show a strong phylogenetic signal, albeit less strong than the rRNA genes (0.78 > R2 > 0.69 for the protein-coding genes versus R2 = 0.84 for the rRNA genes). The AAI approach outlined here could contribute significantly to a genome-based taxonomy for all microbial organisms.