Generalized protein structure prediction based on combination of fold-recognition with de novo folding and evaluation of models

26 September 2005

journal article
research article
Published by Wiley in Proteins-Structure Function and Bioinformatics

Vol. 61 (S7), 84-90
https://doi.org/10.1002/prot.20723

Abstract

To predict the tertiary structure of full-length sequences of all targets in CASP6, regardless of their potential category (from easy comparative modeling to fold recognition to apparent new folds), we used a novel combination of two very different approaches developed independently in our laboratories, which ranked quite well in different categories in CASP5. First, the GeneSilico metaserver was used to identify domains, predict secondary structure, and generate fold recognition (FR) alignments, which were converted to full-atom models using the “FRankenstein's Monster” approach for comparative modeling (CM) by recombination of protein fragments. Additional models generated “de novo” by fully automated servers were obtained from the CASP website. All these models were evaluated by VERIFY3D, and residues with scores better than 0.2 were used as a source of spatial restraints. Second, a new implementation of the lattice-based protein modeling tool CABS was used to carry out folding guided by the above-mentioned restraints with the Replica Exchange Monte Carlo sampling technique. Decoys generated in the course of simulation were subjected to average-linkage hierarchical clustering. For a representative decoy from each cluster, a full-atom model was rebuilt. Finally, five models were selected for submission based on a combination of various criteria, including the size, density, and average energy of the corresponding cluster, and the visual evaluation of the full-atom structures and their relationship to the original templates. The combination of FRankenstein and CABS was one of the best-performing algorithms over all categories in CASP6 (it is important to note that our human intervention was very limited, and all steps in our method can be easily automated). We were able to generate a number of very good models, especially in the Comparative Modeling and New Folds categories. Frequently, the best models were closer to the native structure than any of the templates used. The main problem we encountered was in the ranking of the final models (the only step of significant human intervention), owing to insufficient computational power, which precluded full-atom refinement and energy-based evaluation. Proteins 2005;Suppl 7:84–90.
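The restraint-derivation step described in the abstract (keeping only residues whose VERIFY3D score is better than 0.2) can be illustrated with a minimal Python sketch. The abstract does not say how the restraints were encoded, so everything beyond the 0.2 cutoff is an assumption made for illustration: the function names, the use of Cα-Cα distances, the minimum sequence separation, the distance cap, and the toy coordinates and scores are all hypothetical and are not taken from the authors' code.

```python
from itertools import combinations
from math import dist

SCORE_CUTOFF = 0.2  # per-residue score threshold quoted in the abstract


def reliable_residues(scores, cutoff=SCORE_CUTOFF):
    """Indices of residues scoring better than the cutoff.

    Assumption: "better" means numerically higher.
    """
    return [i for i, s in enumerate(scores) if s > cutoff]


def distance_restraints(ca_coords, scores, min_separation=3, max_distance=12.0):
    """Hypothetical restraint builder, not the authors' implementation.

    Records the C-alpha distance for every pair of reliable residues that
    are at least `min_separation` apart in sequence and no farther than
    `max_distance` angstroms apart in space (both values are illustrative).
    """
    keep = reliable_residues(scores)
    restraints = []
    for i, j in combinations(keep, 2):
        if j - i < min_separation:
            continue  # skip near-neighbours along the chain
        d = dist(ca_coords[i], ca_coords[j])
        if d <= max_distance:
            restraints.append((i, j, round(d, 2)))
    return restraints


# Toy six-residue model: made-up C-alpha coordinates and per-residue scores.
coords = [
    (0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
    (7.6, 3.8, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0),
]
scores = [0.35, 0.10, 0.25, 0.40, 0.05, 0.30]

restraints = distance_restraints(coords, scores)
for i, j, d in restraints:
    print(f"residues {i}-{j}: target distance {d} A")
```

In this toy case, residues 1 and 4 fall below the cutoff and contribute no restraints, so a subsequent folding simulation would be free to rearrange those regions while the well-scored regions stay anchored.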


Cited by 93 articles