Accurate prediction of stability changes in protein mutants by combining machine learning with structure based computational mutagenesis

Open Access

16 July 2008

journal article
research article
Published by Oxford University Press (OUP) in Bioinformatics

Vol. 24 (18), 2002-2009
https://doi.org/10.1093/bioinformatics/btn353

Abstract

Motivation: Accurate predictive models for the impact of single amino acid substitutions on protein stability provide insight into protein structure and function. Such models are also valuable for the design and engineering of new proteins. Previously described methods have utilized properties of protein sequence or structure to predict the free energy change of mutants due to thermal (ΔΔG) and denaturant (ΔΔG^H2O) denaturations, as well as mutant thermal stability (ΔT_m), through the application of either computational energy-based approaches or machine learning techniques. However, the accuracy of these methods when applied separately is frequently far from optimal.

Results: We detail a computational mutagenesis technique based on a four-body, knowledge-based, statistical contact potential. For any mutation due to a single amino acid replacement in a protein, the method provides an empirical normalized measure of the ensuing environmental perturbation occurring at every residue position. A feature vector is generated for the mutant by considering perturbations at the mutated position and its six ordered nearest neighbors in the three-dimensional (3D) protein structure. These predictors of stability change are evaluated by applying machine learning tools to large training sets of mutants derived from diverse proteins that have been experimentally studied and described. Predictive models based on our combined approach either are comparable to or, in many cases, significantly outperform previously published results.

Availability: A web server with supporting documentation is available at http://proteins.gmu.edu/automute

Contact: ivaisman@gmu.edu
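
The feature-vector construction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each residue is represented by a single 3D coordinate, that a per-residue perturbation score has already been computed (the four-body contact potential itself is not reproduced here), and that "ordered" means sorted by increasing Euclidean distance from the mutated position. The function name `feature_vector` and its parameters are hypothetical.

```python
import math


def feature_vector(coords, perturbation, mutated_index, n_neighbors=6):
    """Return the perturbation at the mutated residue followed by the
    perturbations at its nearest neighbors, closest neighbor first.

    coords        -- one (x, y, z) tuple per residue (assumed representation)
    perturbation  -- one precomputed perturbation score per residue
    mutated_index -- 0-based index of the mutated residue
    n_neighbors   -- number of neighbors to include (six in the abstract)
    """
    if len(coords) != len(perturbation):
        raise ValueError("coords and perturbation must have the same length")
    if len(coords) <= n_neighbors:
        raise ValueError("structure has too few residues for this many neighbors")

    origin = coords[mutated_index]
    # Every residue except the mutated one is a candidate neighbor.
    others = [i for i in range(len(coords)) if i != mutated_index]
    # Sort candidates by distance to the mutated residue. Python's sort is
    # stable, so equidistant residues keep their sequence order.
    others.sort(key=lambda i: math.dist(origin, coords[i]))
    nearest = others[:n_neighbors]

    return [perturbation[mutated_index]] + [perturbation[i] for i in nearest]


# Toy data: eight residues spaced evenly along a line, with made-up scores.
toy_coords = [(float(i), 0.0, 0.0) for i in range(8)]
toy_scores = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
toy_features = feature_vector(toy_coords, toy_scores, mutated_index=3)
print(toy_features)  # [3.0, 2.0, 4.0, 1.0, 5.0, 0.0, 6.0]
```

The resulting seven-element vector is the kind of fixed-length input that a machine learning classifier or regressor can be trained on. How ties between equidistant neighbors are broken is a choice of this sketch, not something the abstract specifies.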

Cited by 157 articles