Ab initio modeling of small proteins by iterative TASSER simulations

Open Access

8 May 2007

journal article
research article
Published by Springer Nature in BMC Biology

Vol. 5 (1), 17
https://doi.org/10.1186/1741-7007-5-17

Abstract

Background: Predicting 3-dimensional protein structures from amino-acid sequences is an important unsolved problem in computational structural biology. The problem becomes easier if close homologous proteins have been solved, as high-resolution models can be built by aligning target sequences to the solved homologous structures. However, for sequences without similar folds in the Protein Data Bank (PDB) library, the models have to be predicted from scratch. Progress in ab initio structure modeling has been slow. The aim of this study was to extend the TASSER (threading/assembly/refinement) method to ab initio modeling and to examine systematically its ability to fold small single-domain proteins.

Results: We developed I-TASSER by iteratively implementing the TASSER method, and tested it by folding three benchmark sets of small proteins. First, 16 small proteins (< 90 residues) were used to generate I-TASSER models, which had an average C_α root-mean-square deviation (RMSD) of 3.8 Å, with 6 of them having a C_α-RMSD < 2.5 Å. The overall result was comparable with the all-atom ROSETTA simulation, but the central processing unit (CPU) time used by I-TASSER was much shorter (5 CPU hours for I-TASSER vs. 150 CPU days for ROSETTA). Second, 20 small proteins (< 120 residues) were used. I-TASSER folded four of them with a C_α-RMSD < 2.5 Å. The average C_α-RMSD of the I-TASSER models was 3.9 Å, whereas it was 5.9 Å using the TOUCHSTONE-II software. Finally, 20 non-homologous small proteins (< 120 residues) were taken from the PDB library. An average C_α-RMSD of 3.9 Å was obtained for this third benchmark, with seven cases having a C_α-RMSD < 2.5 Å.

Conclusion: Our simulation results show that I-TASSER can consistently predict the correct folds, and sometimes high-resolution models, for small single-domain proteins. Compared with other ab initio modeling methods such as ROSETTA and TOUCHSTONE-II, the average performance of I-TASSER is either much better or similar at a lower computational cost. These data, together with the strong performance of the automated I-TASSER server (the Zhang-Server) in the 'free modeling' section of the recent Critical Assessment of Structure Prediction (CASP7) experiment, demonstrate new progress in automated ab initio model generation. The I-TASSER server is freely available for academic users at http://zhang.bioinformatics.ku.edu/I-TASSER.
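
The abstract reports model accuracy as C_α-RMSD but does not say how the value is computed. The sketch below shows one common way to obtain it: superimpose the model's C_α atoms onto the native structure with the Kabsch algorithm, then take the root-mean-square distance. It is an illustration only and not the authors' code. The function name `ca_rmsd` and the use of NumPy are assumptions, and the two inputs are assumed to be equal-length, residue-aligned (N, 3) coordinate arrays in Å.

```python
import numpy as np


def ca_rmsd(model, native):
    """C-alpha RMSD after optimal rigid-body superposition (Kabsch).

    Hypothetical helper: `model` and `native` are (N, 3) arrays of
    C-alpha coordinates, listed in the same residue order.
    """
    P = np.asarray(model, dtype=float)
    Q = np.asarray(native, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("expected two (N, 3) coordinate arrays of equal length")

    # Remove the translational component by centering both sets.
    P = P - P.mean(axis=0)
    Q = Q - Q.mean(axis=0)

    # Kabsch: SVD of the 3x3 covariance matrix gives the optimal rotation.
    H = P.T @ Q
    U, _, Vt = np.linalg.svd(H)

    # Force a proper rotation (determinant +1) so mirror images are not matched.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T

    # Rotate the model onto the native structure and measure the deviation.
    diff = (R @ P.T).T - Q
    return float(np.sqrt((diff ** 2).sum() / len(P)))
```

Under these assumptions, calling `ca_rmsd(model_coords, native_coords)` on two such arrays returns a single value in Å that can be compared against thresholds such as the 2.5 Å cutoff quoted above. A rotated and translated copy of the same structure gives a value near zero, while a mirror image does not.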
