Prediction of protein solvent accessibility using support vector machines
- 25 June 2002
- journal article
- research article
- Published by Wiley in Proteins-Structure Function and Bioinformatics
- Vol. 48 (3), 566-570
- https://doi.org/10.1002/prot.10176
Abstract
A Support Vector Machine learning system has been trained to predict protein solvent accessibility from the primary structure. Different kernel functions and sliding window sizes have been explored to find how they affect prediction performance. Using a cut‐off threshold of 15%, which splits the dataset evenly (an equal number of exposed and buried residues), this method achieved a prediction accuracy of 70.1% for single sequence input and 73.9% for multiple alignment sequence input. The prediction of three or more states of solvent accessibility was also studied and compared with other methods. The prediction accuracies are better than, or comparable to, those obtained by other methods such as neural networks, Bayesian classification, multiple linear regression, and information theory. In addition, our results suggest that this system may be combined with other prediction methods to achieve more reliable results, and that the Support Vector Machine method is a very useful tool for biological sequence analysis. Proteins 2002;48:566–570.
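The abstract does not give the input encoding, so the following is only a minimal sketch of how sliding-window features and two-state labels at the 15% cut-off might be prepared before training a classifier. The one-hot encoding with a padding slot, the window size of 15, the function names `encode_windows` and `label_states`, and the choice to count a residue at exactly 15% as exposed are all assumptions made for illustration, not details taken from the paper.

```python
# Sketch only: encoding, window size, and tie-breaking at the threshold are assumptions.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def encode_windows(sequence, window_size=15):
    """One-hot encode a sliding window centred on each residue.

    Each window position uses 21 slots: 20 amino acids plus one slot
    shared by padding (beyond the chain ends) and unknown residues.
    """
    if window_size % 2 == 0:
        raise ValueError("window_size must be odd so the window has a centre")
    half = window_size // 2
    slots = len(AMINO_ACIDS) + 1
    features = []
    for centre in range(len(sequence)):
        vector = [0] * (window_size * slots)
        for offset in range(-half, half + 1):
            pos = centre + offset
            if 0 <= pos < len(sequence):
                slot = AA_INDEX.get(sequence[pos], slots - 1)
            else:
                slot = slots - 1  # position falls outside the sequence
            vector[(offset + half) * slots + slot] = 1
        features.append(vector)
    return features


def label_states(relative_accessibility, threshold=15.0):
    """Map relative accessibility (percent) to 1 = exposed, 0 = buried."""
    return [1 if rsa >= threshold else 0 for rsa in relative_accessibility]


# Hypothetical toy input, not data from the paper.
toy_sequence = "MKTAYIAK"
toy_accessibility = [62.0, 41.5, 9.8, 3.2, 14.9, 15.0, 27.3, 70.1]
X = encode_windows(toy_sequence, window_size=5)
y = label_states(toy_accessibility)
```

The resulting `X` and `y` could then be passed to any SVM implementation with the kernel of choice; varying `window_size` and the kernel is the kind of exploration the abstract describes.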