A Universal Evolutionary Index for Amino Acid Changes

Abstract
Different nonsynonymous changes may be under different selective pressures during evolution. Of the 190 possible interchanges among the 20 amino acids, only 75 can be attained by a single-base substitution. An evolutionary index (EI) can be computed empirically for each of these 75 elementary changes as the likelihood of substitution relative to that of synonymous changes. We used 280, 1,306, 2,488, and 309 orthologous genes from primates (human versus Old World monkey), rodents (mouse versus rat), yeast (S. cerevisiae versus S. paradoxus), and Drosophila (D. melanogaster versus D. simulans), respectively, to estimate the EIs. In each data set, EI varies more than 10-fold, and the correlation coefficients of EIs between pairs of data sets are high (e.g., r = 0.91 between rodent and yeast). The high correlations suggest that amino acid properties are strong determinants of protein evolution, irrespective of the identities of the proteins or the taxa of interest. However, these properties are not well captured by conventional measures of amino acid exchangeability. We therefore propose a universal index of exchange (U): for any large data set, its EI can be expressed as U*R, where R is the average Ka/Ks for that data set. The codon-based, empirically determined EI (i.e., U*R) predicts protein evolution much better than previous methods do.
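
The counts of 190 possible interchanges and 75 elementary changes can be checked by direct enumeration. The sketch below is illustrative only and is not taken from the paper: it assumes the standard nuclear genetic code, ignores stop codons, and treats each amino acid pair as unordered. The names CODON_TABLE and elementary_changes are hypothetical helpers introduced for this example.

```python
from itertools import combinations, product

# Assumption: standard genetic code, codons enumerated in TCAG order
# (TTT, TTC, TTA, TTG, TCT, ...). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def elementary_changes(table):
    """Return the unordered amino acid pairs linked by one base substitution."""
    pairs = set()
    for codon, aa in table.items():
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                other = table[mutant]
                # Skip synonymous changes and anything involving a stop codon.
                if other == aa or "*" in (aa, other):
                    continue
                pairs.add(frozenset((aa, other)))
    return pairs


amino_acids = sorted(set(AMINO_ACIDS) - {"*"})
all_pairs = list(combinations(amino_acids, 2))
changes = elementary_changes(CODON_TABLE)

print(len(all_pairs), len(changes))  # 190 75
```

Under these assumptions the enumeration reproduces both figures quoted in the abstract; a different genetic code (for example, a mitochondrial one) would be expected to give a somewhat different set of elementary changes.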