Abstract
Motivation: Accurate predictive models for the impact of single amino acid substitutions on protein stability provide insight into protein structure and function. Such models are also valuable for the design and engineering of new proteins. Previously described methods have utilized properties of protein sequence or structure to predict the free energy change of mutants due to thermal (ΔΔG) and denaturant (ΔΔGH2O) denaturations, as well as mutant thermal stability (ΔTm), through the application of either computational energy-based approaches or machine learning techniques. However, accuracy associated with applying these methods separately is frequently far from optimal. Results: We detail a computational mutagenesis technique based on a four-body, knowledge-based, statistical contact potential. For any mutation due to a single amino acid replacement in a protein, the method provides an empirical normalized measure of the ensuing environmental perturbation occurring at every residue position. A feature vector is generated for the mutant by considering perturbations at the mutated position and it's ordered six nearest neighbors in the 3-dimensional (3D) protein structure. These predictors of stability change are evaluated by applying machine learning tools to large training sets of mutants derived from diverse proteins that have been experimentally studied and described. Predictive models based on our combined approach are either comparable to, or in many cases significantly outperform, previously published results. Availability: A web server with supporting documentation is available at http://proteins.gmu.edu/automute Contact:ivaisman@gmu.edu