Comparative protein structure modeling by iterative alignment, model building and model assessment
Open Access
- 15 July 2003
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 31 (14), 3982-3992
- https://doi.org/10.1093/nar/gkg460
Abstract
Comparative or homology protein structure modeling is severely limited by errors in the alignment of a modeled sequence with related proteins of known three‐dimensional structure. To ameliorate this problem, we have developed an automated method that optimizes both the alignment and the model implied by it. This task is achieved by a genetic algorithm protocol that starts with a set of initial alignments and then iterates through re‐alignment, model building and model assessment to optimize a model assessment score. During this iterative process: (i) new alignments are constructed by application of a number of operators, such as alignment mutations and cross‐overs; (ii) comparative models corresponding to these alignments are built by satisfaction of spatial restraints, as implemented in our program MODELLER; (iii) the models are assessed by a variety of criteria, partly depending on an atomic statistical potential. When testing the procedure on a very difficult set of 19 modeling targets sharing only 4–27% sequence identity with their template structures, the average final alignment accuracy increased from 37 to 45% relative to the initial alignment (the alignment accuracy was measured as the percentage of positions in the tested alignment that were identical to the reference structure‐based alignment). Correspondingly, the average model accuracy increased from 43 to 54% (the model accuracy was measured as the percentage of the Cα atoms of the model that were within 5 Å of the corresponding Cα atoms in the superposed native structure). The present method also compares favorably with two of the most successful previously described methods, PSI‐BLAST and SAM. The accuracy of the final models would be increased further if a better method for ranking of the models were available.
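The two accuracy measures defined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code and does not use MODELLER. The function names (`aligned_pairs`, `alignment_accuracy`, `model_accuracy`) and the gapped-string alignment representation are hypothetical choices made for illustration. Two assumptions are worth stating: the alignment accuracy is taken literally as a fraction of the aligned positions in the tested alignment (the abstract does not spell out the denominator), and the model and native Cα coordinates are assumed to be already superposed and listed in matching order.

```python
import math


def aligned_pairs(target_row, template_row):
    """Return the set of (target_index, template_index) residue pairs
    that a gapped pairwise alignment places in the same column.
    Gaps are written as '-' (an assumption of this sketch)."""
    if len(target_row) != len(template_row):
        raise ValueError("alignment rows must have equal length")
    pairs = set()
    i = j = 0  # running residue indices in target and template
    for a, b in zip(target_row, template_row):
        if a != "-" and b != "-":
            pairs.add((i, j))
        if a != "-":
            i += 1
        if b != "-":
            j += 1
    return pairs


def alignment_accuracy(test_pairs, ref_pairs):
    """Percentage of aligned positions in the tested alignment that are
    identical to the reference alignment (denominator is an assumption)."""
    if not test_pairs:
        return 0.0
    return 100.0 * len(test_pairs & ref_pairs) / len(test_pairs)


def model_accuracy(model_ca, native_ca, cutoff=5.0):
    """Percentage of model C-alpha atoms within `cutoff` angstroms of the
    corresponding native C-alpha atoms. Coordinates are (x, y, z) tuples
    and are assumed to be superposed beforehand."""
    if len(model_ca) != len(native_ca):
        raise ValueError("coordinate lists must have equal length")
    if not model_ca:
        return 0.0
    close = sum(
        1 for m, n in zip(model_ca, native_ca) if math.dist(m, n) <= cutoff
    )
    return 100.0 * close / len(model_ca)


# Toy data, invented for illustration only.
reference = aligned_pairs("ACDEF", "ACDEF")
tested = aligned_pairs("ACDEF-", "A-CDEF")
toy_alignment_score = alignment_accuracy(tested, reference)

toy_model = [(0.0, 0.0, 0.0), (0.0, 0.0, 3.0), (3.0, 4.0, 0.0), (0.0, 0.0, 10.0)]
toy_native = [(0.0, 0.0, 0.0)] * 4
toy_model_score = model_accuracy(toy_model, toy_native)
```

In the toy alignment only one of the four aligned pairs matches the reference, and in the toy model three of the four Cα atoms lie within 5 Å of their native positions. A real evaluation would also need a structure-based reference alignment and a superposition step, both of which are outside this sketch.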