Generalized protein structure prediction based on combination of fold-recognition with de novo folding and evaluation of models
- 26 September 2005
- journal article
- research article
- Published by Wiley in Proteins-Structure Function and Bioinformatics
- Vol. 61 (S7), 84-90
- https://doi.org/10.1002/prot.20723
Abstract
To predict the tertiary structure of full-length sequences of all targets in CASP6, regardless of their potential category (from easy comparative modeling to fold recognition to apparent new folds), we used a novel combination of two very different approaches developed independently in our laboratories, which ranked quite well in different categories in CASP5. First, the GeneSilico metaserver was used to identify domains, predict secondary structure, and generate fold recognition (FR) alignments, which were converted to full-atom models using the “FRankenstein's Monster” approach for comparative modeling (CM) by recombination of protein fragments. Additional models generated “de novo” by fully automated servers were obtained from the CASP website. All these models were evaluated by VERIFY3D, and residues with scores better than 0.2 were used as a source of spatial restraints. Second, a new implementation of the lattice-based protein modeling tool CABS was used to carry out folding guided by the above-mentioned restraints with the Replica Exchange Monte Carlo sampling technique. Decoys generated in the course of the simulation were subjected to average-linkage hierarchical clustering. For a representative decoy from each cluster, a full-atom model was rebuilt. Finally, five models were selected for submission based on a combination of various criteria, including the size, density, and average energy of the corresponding cluster, and the visual evaluation of the full-atom structures and their relationship to the original templates. The combination of FRankenstein and CABS was one of the best-performing algorithms over all categories in CASP6 (it is important to note that our human intervention was very limited, and all steps in our method can be easily automated). We were able to generate a number of very good models, especially in the Comparative Modeling and New Folds categories. Frequently, the best models were closer to the native structure than any of the templates used. The main problem we encountered was in the ranking of the final models (the only step of significant human intervention), due to insufficient computational power, which precluded full-atom refinement and energy-based evaluation. Proteins 2005;Suppl 7:84–90.
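The restraint-derivation step described in the abstract (keeping only residues whose VERIFY3D score is better than 0.2) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that "better than 0.2" means a score strictly greater than 0.2, and that restraints are pairwise Cα-Cα distances between reliable residues. The function name `derive_restraints`, the 10 Å distance cutoff, the minimum sequence separation of 3, and the toy coordinates and scores are all hypothetical choices made for illustration.

```python
from itertools import combinations
from math import dist  # Python 3.8+

SCORE_THRESHOLD = 0.2    # threshold stated in the abstract
MAX_DISTANCE = 10.0      # assumption: keep only pairs within 10 angstroms
MIN_SEQ_SEPARATION = 3   # assumption: skip near-neighbors along the chain


def derive_restraints(ca_coords, scores,
                      threshold=SCORE_THRESHOLD,
                      max_distance=MAX_DISTANCE,
                      min_separation=MIN_SEQ_SEPARATION):
    """Return (i, j, distance) restraints between reliably modeled residues.

    ca_coords: dict mapping residue index -> (x, y, z) of the C-alpha atom
    scores:    dict mapping residue index -> per-residue quality score
    """
    # Keep residues scoring strictly above the threshold (our interpretation
    # of "better than 0.2") that also have coordinates in the model.
    reliable = sorted(i for i, s in scores.items()
                      if s > threshold and i in ca_coords)

    restraints = []
    for i, j in combinations(reliable, 2):
        if j - i < min_separation:
            continue  # too close in sequence to be informative
        d = dist(ca_coords[i], ca_coords[j])
        if d <= max_distance:
            restraints.append((i, j, round(d, 2)))
    return restraints


# Toy six-residue model (made-up coordinates and scores).
coords = {
    1: (0.0, 0.0, 0.0), 2: (3.8, 0.0, 0.0), 3: (7.6, 0.0, 0.0),
    4: (7.6, 3.8, 0.0), 5: (3.8, 3.8, 0.0), 6: (0.0, 3.8, 0.0),
}
scores = {1: 0.45, 2: 0.31, 3: 0.05, 4: 0.28, 5: 0.22, 6: 0.20}

restraints = derive_restraints(coords, scores)
for i, j, d in restraints:
    print(f"residues {i}-{j}: {d} A")
```

In this toy input, residues 3 and 6 are discarded because their scores do not exceed the threshold, so no restraint involves them. In the workflow the abstract describes, restraints of this kind would be pooled from many models and passed to the folding simulation; how they are weighted there is not specified in the abstract.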