Estimating Recombinational Parameters in Streptococcus pneumoniae From Multilocus Sequence Typing Data

Abstract
Multilocus sequence typing (MLST) is a highly discriminatory molecular typing method that defines isolates of bacterial pathogens using the sequences of ~450-bp internal fragments of seven housekeeping genes. Applied to 575 isolates of Streptococcus pneumoniae, the technique identifies a number of discrete clonal complexes. Each clonal complex is typically represented by a single group of isolates sharing identical alleles at all seven loci, plus single-locus variants that differ from this group at only one of the seven loci. Because MLST is highly discriminatory, the members of each clonal complex can be assumed to share a recent common ancestor, and the molecular events that gave rise to the single-locus variants can be used to estimate the relative contributions of recombination and mutation to clonal divergence. By comparing the sequence of each variant allele with the allele typically found within its clonal complex, we estimate that recombination has generated new alleles at a frequency ~10-fold higher than mutation, and that a single nucleotide site is ~50 times more likely to change through recombination than through mutation. We also demonstrate how to estimate the average length of recombinational replacements from MLST data.
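The comparison of variant alleles with the typical allele of a clonal complex can be illustrated with a minimal Python sketch. It is not the procedure used in this study: the classification rule (a variant that differs at more than one site, or that is also found in unrelated lineages, is attributed to recombination; a novel single-site variant is attributed to point mutation) is an assumption made for illustration, as are the function names and the toy sequences.

```python
from collections import Counter


def count_differences(typical: str, variant: str) -> int:
    """Count the aligned nucleotide sites at which two alleles differ."""
    if len(typical) != len(variant):
        raise ValueError("alleles must be aligned to the same length")
    return sum(a != b for a, b in zip(typical, variant))


def classify_variant(typical: str, variant: str, seen_elsewhere: bool) -> str:
    """Assign a single-locus variant to 'recombination' or 'mutation'.

    ASSUMED rule (illustrative only): multiple nucleotide differences, or an
    allele that also occurs in unrelated lineages, suggests recombination;
    a novel allele with a single difference suggests point mutation.
    """
    n_diff = count_differences(typical, variant)
    if n_diff == 0:
        raise ValueError("variant allele is identical to the typical allele")
    if n_diff > 1 or seen_elsewhere:
        return "recombination"
    return "mutation"


def recombination_to_mutation_ratios(variants):
    """Return (per-allele ratio, per-site ratio) of recombination to mutation.

    `variants` is an iterable of (typical, variant, seen_elsewhere) tuples,
    one per single-locus variant.
    """
    events = Counter()  # number of variant alleles per event type
    sites = Counter()   # number of changed nucleotide sites per event type
    for typical, variant, seen_elsewhere in variants:
        kind = classify_variant(typical, variant, seen_elsewhere)
        events[kind] += 1
        sites[kind] += count_differences(typical, variant)
    if events["mutation"] == 0:
        raise ValueError("no mutation events; ratios are undefined")
    per_allele = events["recombination"] / events["mutation"]
    per_site = sites["recombination"] / sites["mutation"]
    return per_allele, per_site


# Hypothetical 10-bp toy alleles, not real MLST data.
TYPICAL = "ACGTACGTAC"
toy_variants = [
    (TYPICAL, "ACGTACGTAT", False),  # one novel change
    (TYPICAL, "ATGTACCTAT", False),  # three changes
    (TYPICAL, "ACGAACGTAC", True),   # one change, allele seen elsewhere
    (TYPICAL, "TCGTACGAAC", False),  # two changes
]
per_allele, per_site = recombination_to_mutation_ratios(toy_variants)
print(per_allele, per_site)
```

The per-allele ratio counts events, whereas the per-site ratio counts changed nucleotides. A single recombinational replacement can alter several sites at once, so the per-site ratio can exceed the per-allele ratio, as it does in the estimates reported above.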