MUSTER: Improving protein sequence profile–profile alignments by using multiple sources of structure information
- 4 February 2008
- journal article
- research article
- Published by Wiley in Proteins: Structure, Function, and Bioinformatics
- Vol. 72 (2), 547-556
- https://doi.org/10.1002/prot.21945
Abstract
We develop a new threading algorithm, MUSTER, by extending the previous sequence profile–profile alignment method, PPA. It combines various sequence and structure information into single‐body terms which can be conveniently used in a dynamic programming search: (1) sequence profiles; (2) secondary structures; (3) structure fragment profiles; (4) solvent accessibility; (5) dihedral torsion angles; (6) hydrophobic scoring matrix. The balance of the weighting parameters is optimized by a grading search based on the average TM‐score of 111 training proteins, which shows a better performance than the conventional optimization methods based on the PROSUP database. The algorithm is tested on 500 nonhomologous proteins independent of the training sets. After removing the homologous templates with a sequence identity to the target >30%, in 224 cases the first template alignment has the correct topology with a TM‐score >0.5. Even with a more stringent cutoff by removing the templates with a sequence identity >20% or detectable by PSI‐BLAST with an E‐value 0.5. Depending on the homology cutoffs, the average TM‐score of the first threading alignments by MUSTER is 5.1–6.3% higher than that by PPA. This improvement is statistically significant by the Wilcoxon signed rank test with a P‐value < 1.0 × 10⁻¹³, which demonstrates the effect of additional structural information on protein fold recognition. The MUSTER server is freely available to the academic community at http://zhang.bioinformatics.ku.edu/MUSTER. Proteins 2008.
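The idea of folding several single-body terms into one score that dynamic programming can use may be easier to see in a small sketch. The Python below is an illustration only, not the MUSTER implementation: the function names (`combine_terms`, `global_align_score`), the term names, the weights, and the linear gap penalty are all assumptions made for the example, and the abstract does not specify the gap scheme or the weight values. Each term is assumed to be a precomputed matrix that scores query position i against template position j.

```python
from typing import Dict, List

Matrix = List[List[float]]


def combine_terms(terms: Dict[str, Matrix], weights: Dict[str, float]) -> Matrix:
    """Weighted sum of per-position score matrices (query rows x template columns).

    Term names and weights are hypothetical; every matrix must share one shape.
    """
    names = list(terms)
    n_rows = len(terms[names[0]])
    n_cols = len(terms[names[0]][0])
    return [
        [sum(weights[k] * terms[k][i][j] for k in names) for j in range(n_cols)]
        for i in range(n_rows)
    ]


def global_align_score(score: Matrix, gap: float = -1.0) -> float:
    """Needleman-Wunsch global alignment score with a linear gap penalty.

    The linear gap penalty is a simplification chosen for this sketch.
    """
    n, m = len(score), len(score[0])
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap  # query prefix aligned entirely to gaps
    for j in range(1, m + 1):
        dp[0][j] = j * gap  # template prefix aligned entirely to gaps
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + score[i - 1][j - 1],  # align i with j
                dp[i - 1][j] + gap,                      # gap in template
                dp[i][j - 1] + gap,                      # gap in query
            )
    return dp[n][m]


# Toy example: two hypothetical terms over a 2-residue query and template.
toy_terms = {
    "profile": [[1.0, 0.0], [0.0, 1.0]],
    "secondary_structure": [[1.0, 0.0], [0.0, 1.0]],
}
toy_weights = {"profile": 1.0, "secondary_structure": 0.5}
combined = combine_terms(toy_terms, toy_weights)
best = global_align_score(combined)
```

Because every term depends only on one query position and one template position, the combined matrix can be built once before alignment, and the dynamic programming recursion is unchanged no matter how many terms are added. Tuning then reduces to searching over the entries of the weights dictionary.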