Inferring Stabilizing Mutations from Protein Phylogenies: Application to Influenza Hemagglutinin

Abstract
One selection pressure shaping sequence evolution is the requirement that a protein fold with sufficient stability to perform its biological functions. We present a conceptual framework that explains how this requirement causes the probability that a particular amino acid mutation is fixed during evolution to depend on its effect on protein stability. We mathematically formalize this framework to develop a Bayesian approach for inferring the stability effects of individual mutations from homologous protein sequences of known phylogeny. This approach is able to predict published experimentally measured mutational stability effects (ΔΔG values) with an accuracy that exceeds both a state-of-the-art physicochemical modeling program and the sequence-based consensus approach. As a further test, we use our phylogenetic inference approach to predict stabilizing mutations to influenza hemagglutinin. We introduce these mutations into a temperature-sensitive influenza virus with a defect in its hemagglutinin gene and experimentally demonstrate that some of the mutations allow the virus to grow at higher temperatures. Our work therefore describes a powerful new approach for predicting stabilizing mutations that can be successfully applied even to large, complex proteins such as hemagglutinin. This approach also makes a mathematical link between phylogenetics and experimentally measurable protein properties, potentially paving the way for more accurate analyses of molecular evolution.

Author Summary
Mutating a protein frequently causes a change in its stability. As scientists, we often care about these changes because we would like to engineer a protein's stability or understand how its stability is impacted by a naturally occurring mutation. Evolution also cares about mutational stability changes, because a basic evolutionary requirement is that proteins remain sufficiently stable to perform their biological functions.
Our work is based on the idea that, because evolution selects for stability, it should be possible to infer the stability effects of specific mutations from the sequences of related proteins. We show that we can indeed use protein evolutionary histories to computationally predict previously measured mutational stability changes more accurately than methods based on either of the two main existing strategies. We then test whether we can predict mutations that increase the stability of hemagglutinin, an influenza protein whose rapid evolution is partly responsible for the ability of this virus to cause yearly epidemics. We experimentally create viruses carrying predicted stabilizing mutations and find that several do in fact improve the virus's ability to grow at higher temperatures. Our computational approach may therefore be of use in understanding the evolution of this medically important virus.