Accurate prediction of stability changes in protein mutants by combining machine learning with structure based computational mutagenesis
Open Access
- 16 July 2008
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 24 (18), 2002-2009
- https://doi.org/10.1093/bioinformatics/btn353
Abstract
Motivation: Accurate predictive models for the impact of single amino acid substitutions on protein stability provide insight into protein structure and function. Such models are also valuable for the design and engineering of new proteins. Previously described methods have utilized properties of protein sequence or structure to predict the free energy change of mutants due to thermal (ΔΔG) and denaturant (ΔΔGH2O) denaturations, as well as mutant thermal stability (ΔTm), through the application of either computational energy-based approaches or machine learning techniques. However, accuracy associated with applying these methods separately is frequently far from optimal.
Results: We detail a computational mutagenesis technique based on a four-body, knowledge-based, statistical contact potential. For any mutation due to a single amino acid replacement in a protein, the method provides an empirical normalized measure of the ensuing environmental perturbation occurring at every residue position. A feature vector is generated for the mutant by considering perturbations at the mutated position and its ordered six nearest neighbors in the 3-dimensional (3D) protein structure. These predictors of stability change are evaluated by applying machine learning tools to large training sets of mutants derived from diverse proteins that have been experimentally studied and described. Predictive models based on our combined approach are either comparable to, or in many cases significantly outperform, previously published results.
Availability: A web server with supporting documentation is available at http://proteins.gmu.edu/automute
Contact: ivaisman@gmu.edu
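The feature-vector construction described in the Results can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the per-residue perturbation scores are computed from the four-body potential or how the neighbors are ordered. The sketch assumes the scores are already available as a list, that each residue is represented by a single 3D point, and that neighbors are ordered by increasing Euclidean distance. The function name `feature_vector` and all numeric values are hypothetical.

```python
import math


def feature_vector(coords, perturbation, mutated_index, n_neighbors=6):
    """Build a mutant feature vector from per-residue perturbation scores.

    coords: list of (x, y, z) tuples, one representative point per residue
            (assumption: the abstract does not specify the representation).
    perturbation: per-residue perturbation scores, assumed precomputed.
    Returns the score at the mutated position followed by the scores of its
    n_neighbors nearest residues, ordered by increasing distance (assumed).
    """
    if len(coords) != len(perturbation):
        raise ValueError("coords and perturbation must have the same length")
    if len(coords) <= n_neighbors:
        raise ValueError("structure has too few residues for the requested neighbors")

    center = coords[mutated_index]
    # Rank every other residue by its distance to the mutated position.
    others = [i for i in range(len(coords)) if i != mutated_index]
    others.sort(key=lambda i: math.dist(center, coords[i]))
    nearest = others[:n_neighbors]

    return [perturbation[mutated_index]] + [perturbation[i] for i in nearest]


# Toy data (hypothetical): eight residues placed along the x-axis.
toy_coords = [(x, 0.0, 0.0) for x in (0.0, 1.0, 2.0, 3.0, 4.5, 6.2, 8.0, 12.0)]
toy_scores = [0.1, -0.2, 0.5, -1.3, 0.4, 0.0, 0.3, 0.9]

toy_features = feature_vector(toy_coords, toy_scores, mutated_index=3)
print(toy_features)
```

The resulting seven-component vector (one value for the mutated position plus six for its neighbors) is the kind of input that a machine learning model could be trained on against experimental stability measurements.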