Comparative protein structure modeling by iterative alignment, model building and model assessment

Open Access

15 July 2003

journal article
research article
Published by Oxford University Press (OUP) in Nucleic Acids Research

Vol. 31 (14), 3982-3992
https://doi.org/10.1093/nar/gkg460

Abstract

Comparative or homology protein structure modeling is severely limited by errors in the alignment of a modeled sequence with related proteins of known three‐dimensional structure. To ameliorate this problem, we have developed an automated method that optimizes both the alignment and the model implied by it. This task is achieved by a genetic algorithm protocol that starts with a set of initial alignments and then iterates through re‐alignment, model building and model assessment to optimize a model assessment score. During this iterative process: (i) new alignments are constructed by application of a number of operators, such as alignment mutations and cross‐overs; (ii) comparative models corresponding to these alignments are built by satisfaction of spatial restraints, as implemented in our program MODELLER; (iii) the models are assessed by a variety of criteria, partly depending on an atomic statistical potential. When testing the procedure on a very difficult set of 19 modeling targets sharing only 4–27% sequence identity with their template structures, the average final alignment accuracy increased from 37 to 45% relative to the initial alignment (the alignment accuracy was measured as the percentage of positions in the tested alignment that were identical to the reference structure‐based alignment). Correspondingly, the average model accuracy increased from 43 to 54% (the model accuracy was measured as the percentage of the Cα atoms of the model that were within 5 Å of the corresponding Cα atoms in the superposed native structure). The present method also compares favorably with two of the most successful previously described methods, PSI‐BLAST and SAM. The accuracy of the final models would be increased further if a better method for ranking of the models were available.
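The two accuracy measures defined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the representation of an alignment as (target index, template index) pairs, and the inclusive treatment of the 5 Å cutoff are assumptions made here for illustration. The coordinates are assumed to be already superposed.

```python
import math


def alignment_accuracy(test_pairs, reference_pairs):
    """Percentage of aligned positions in the tested alignment that also
    occur in the reference structure-based alignment.

    Each alignment is given as (target_index, template_index) pairs;
    this representation is an assumption of the sketch.
    """
    test = set(test_pairs)
    if not test:
        return 0.0
    return 100.0 * len(test & set(reference_pairs)) / len(test)


def model_accuracy(model_ca, native_ca, cutoff=5.0):
    """Percentage of model C-alpha atoms within `cutoff` angstroms of the
    corresponding native C-alpha atoms.

    Both inputs are equal-length lists of (x, y, z) tuples taken after
    superposition; the superposition itself is not performed here.
    """
    if len(model_ca) != len(native_ca):
        raise ValueError("model and native must list the same C-alpha atoms")
    if not model_ca:
        return 0.0
    close = sum(
        1 for m, n in zip(model_ca, native_ca) if math.dist(m, n) <= cutoff
    )
    return 100.0 * close / len(model_ca)


# Hypothetical toy data: three of four aligned pairs match the reference.
test_alignment = [(0, 0), (1, 1), (2, 3), (3, 4)]
reference_alignment = [(0, 0), (1, 1), (2, 2), (3, 4)]
print(alignment_accuracy(test_alignment, reference_alignment))  # 75.0

# Hypothetical toy coordinates: deviations of 5 and 6 angstroms.
model = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
native = [(3.0, 4.0, 0.0), (10.0, 0.0, 6.0)]
print(model_accuracy(model, native))  # 50.0
```

In an iterative protocol such as the one described, measures of this kind serve only to evaluate the result against a known native structure. The optimization itself has to rely on a model assessment score that does not need the native structure.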


Cited by 273 articles