Predicting subcellular localization of proteins for Gram‐negative bacteria by support vector machines based on n‐peptide compositions
- 1 May 2004
- journal article
- research article
- Published by Wiley in Protein Science
- Vol. 13 (5), 1402-1406
- https://doi.org/10.1110/ps.03479604
Abstract
Gram‐negative bacteria have five major subcellular localization sites: the cytoplasm, the periplasm, the inner membrane, the outer membrane, and the extracellular space. The subcellular location of a protein can provide valuable information about its function. With the rapid increase in sequenced genomic data, the need for an automated and accurate tool to predict subcellular localization is becoming increasingly important. We present an approach to predicting subcellular localization for Gram‐negative bacteria. The method uses support vector machines trained on multiple feature vectors based on n‐peptide compositions. For a standard data set comprising 1443 proteins, the overall prediction accuracy reaches 89%, which, to the best of our knowledge, is the highest prediction rate ever reported. Our prediction accuracy is 14% higher than that of the recently developed multimodular PSORT‐B. Because of its simplicity, this approach can be easily extended to other organisms and should be a useful tool for high‐throughput, large‐scale analysis of proteomic and genomic data.
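To make the idea of an n‐peptide composition concrete, the following is a minimal sketch of how such a feature vector might be computed for n = 1 (amino acid composition) and n = 2 (dipeptide composition). The abstract does not specify the exact coding schemes, so the function name `n_peptide_composition`, the normalization, the handling of non‐standard residues, and the toy sequence are all assumptions made for illustration, not the authors' implementation.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def n_peptide_composition(sequence, n=2):
    """Return the normalized frequency of every length-n peptide.

    The vector has 20**n entries, ordered as itertools.product
    enumerates them over AMINO_ACIDS.
    """
    keys = ["".join(p) for p in product(AMINO_ACIDS, repeat=n)]
    counts = dict.fromkeys(keys, 0)
    total = 0
    for i in range(len(sequence) - n + 1):
        window = sequence[i:i + n]
        if window in counts:  # assumption: skip windows with non-standard residues
            counts[window] += 1
            total += 1
    if total == 0:
        return [0.0] * len(keys)
    return [counts[k] / total for k in keys]


# Hypothetical toy sequence, not taken from the paper's data set.
toy = "MKKLLAAL"
amino = n_peptide_composition(toy, n=1)   # 20 features
dipep = n_peptide_composition(toy, n=2)   # 400 features
features = amino + dipep                  # one possible combined vector
```

Vectors of this kind, one per protein and labeled with one of the five localization sites, could then be passed to any multiclass support vector machine implementation. How the authors combine classifiers trained on the different feature vectors is not stated in the abstract.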