Statistical potentials for fold assessment
Open Access
- 1 February 2002
- journal article
- research article
- Published by Wiley in Protein Science
- Vol. 11 (2), 430-448
- https://doi.org/10.1002/pro.110430
Abstract
A protein structure model generally needs to be evaluated to assess whether or not it has the correct fold. To improve fold assessment, four types of residue‐level statistical potential were optimized: distance‐dependent, contact, ϕ/Ψ dihedral angle, and accessible surface statistical potentials. Approximately 10,000 test models with the correct and incorrect folds were built by automated comparative modeling of protein sequences of known structure. The criterion used to discriminate between the correct and incorrect models was the Z‐score of the model energy. The performance of a Z‐score was determined as a function of many variables in the derivation and use of the corresponding statistical potential. The performance was measured by the fractions of the correctly and incorrectly assessed test models. The most discriminating combination of the four tested potentials is the sum of the normalized distance‐dependent and accessible surface potentials. The distance‐dependent potential that is optimal for assessing models of all sizes uses both Cα and Cβ atoms as interaction centers, distinguishes between all 20 standard residue types, has a distance range of 30 Å, and is derived and used by taking into account the sequence separation of the interacting atom pairs. The terms for the sequentially local interactions are significantly less informative than those for the sequentially nonlocal interactions. The accessible surface potential that is optimal for assessing models of all sizes uses Cβ atoms as interaction centers and distinguishes between all 20 standard residue types. The performance of the tested statistical potentials is not likely to improve significantly with an increase in the number of known protein structures used in their derivation.
The parameters of fold assessment whose optimal values vary significantly with model size include the size of the known protein structures used to derive the potential and the distance range of the accessible surface potential. Fold assessment by statistical potentials is most difficult for very small models. This difficulty presents a challenge to fold assessment in large‐scale comparative modeling, which produces many small and incomplete models. The results described in this study provide a basis for the optimal use of statistical potentials in fold assessment.
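The Z‐score criterion and the summed, normalized potentials described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the reference energies are generated, how each potential is normalized, or what cutoff separates correct from incorrect folds, so the reference ensembles, the function names (`z_score`, `combined_z_score`, `has_correct_fold`), and the threshold below are all assumptions made for illustration only.

```python
import statistics


def z_score(model_energy, reference_energies):
    """Z-score of a model energy against a reference energy distribution.

    Assumption: the reference energies come from some decoy or randomized
    ensemble; the abstract does not say how that ensemble is built.
    """
    mean = statistics.fmean(reference_energies)
    sd = statistics.pstdev(reference_energies)  # population standard deviation
    if sd == 0:
        raise ValueError("reference energies have zero spread")
    return (model_energy - mean) / sd


def combined_z_score(model_energies, reference_energies):
    """Sum of per-potential Z-scores.

    Both arguments are dicts keyed by a potential name, e.g.
    "distance" and "surface" (hypothetical keys).
    """
    return sum(
        z_score(model_energies[name], reference_energies[name])
        for name in model_energies
    )


def has_correct_fold(z, threshold):
    """Lower (more negative) Z-scores are taken to indicate a correct fold.

    The threshold is a free parameter here, not a value from the study.
    """
    return z <= threshold


if __name__ == "__main__":
    model = {"distance": -3.0, "surface": -2.0}
    reference = {"distance": [-1.0, 1.0], "surface": [0.0, 4.0]}
    z = combined_z_score(model, reference)
    print(z, has_correct_fold(z, threshold=-4.0))
```

Expressing each potential as a Z‐score before summing puts terms with different energy scales on a common footing, which is one plausible reading of "normalized" in the abstract.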