Toward a More Robust Assessment of Intraspecies Diversity, Using Fewer Genetic Markers
Open Access
- 1 November 2006
- journal article
- research article
- Published by American Society for Microbiology in Applied and Environmental Microbiology
- Vol. 72 (11), 7286-7293
- https://doi.org/10.1128/aem.01398-06
Abstract
Phylogenetic sequence analysis of single or multiple genes has dominated the study and census of the genetic diversity among closely related bacteria. It remains unclear, however, how the results based on a few genes in the genome correlate with whole-genome-based relatedness and what genes (if any) best reflect whole-genome-level relatedness and hence should be preferentially used to economize on cost and to improve accuracy. We show here that phylogenies of closely related organisms based on the average nucleotide identity (ANI) of their shared genes correspond accurately to phylogenies based on state-of-the-art analysis of their whole-genome sequences. We use ANI to evaluate the phylogenetic robustness of every gene in the genome and show that almost all core genes, regardless of their functions and positions in the genome, offer robust phylogenetic reconstruction among strains that show 80 to 95% ANI (16S rRNA identity, >98.5%). Lack of elapsed time and, to a lesser extent, horizontal transfer and recombination make the selection of genes more critical for applications that target the intraspecies level, i.e., strains that show >95% ANI according to current standards. A much more accurate phylogeny for the Escherichia coli group was obtained based on just three best-performing genes according to our analysis compared to the concatenated alignment of eight genes that are commonly employed for phylogenetic purposes in this group. Our results are reproducible within the Salmonella, Burkholderia, and Shewanella groups and therefore are expected to have general applicability for microevolution studies, including metagenomic surveys.
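To make the ANI measure concrete, the following minimal Python sketch averages per-gene nucleotide identity over a set of shared genes and places the result in the ANI ranges quoted in the abstract. It is an illustration under simplifying assumptions, not the authors' pipeline: the sequences are assumed to be already pairwise aligned (gap character `-`), and the function names `percent_identity`, `average_nucleotide_identity`, and `relatedness_band` are hypothetical.

```python
from statistics import mean


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical bases over ungapped columns of a pairwise alignment.

    Assumption: both strings come from the same alignment and use '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def average_nucleotide_identity(shared_genes: dict) -> float:
    """Unweighted mean of per-gene identities.

    shared_genes maps a gene name to a pair of aligned sequences,
    one from each genome being compared.
    """
    return mean(percent_identity(a, b) for a, b in shared_genes.values())


def relatedness_band(ani: float) -> str:
    """Label an ANI value using the ranges mentioned in the abstract."""
    if ani > 95.0:
        return "intraspecies (>95% ANI)"
    if ani >= 80.0:
        return "80 to 95% ANI"
    return "below 80% ANI"


# Toy input: two short, made-up gene alignments.
toy_genes = {
    "geneA": ("ACGT", "ACGA"),
    "geneB": ("AAAA", "AAAA"),
}
toy_ani = average_nucleotide_identity(toy_genes)
print(toy_ani, relatedness_band(toy_ani))
```

A real analysis would first identify the shared genes between two genomes (for example with a sequence search tool) and would typically apply match-quality cutoffs before averaging; those steps are omitted here.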