Predicting subcellular localization of proteins using machine-learned classifiers

Open Access

22 January 2004

journal article
research article
Published by Oxford University Press (OUP) in Bioinformatics

Vol. 20 (4), 547-556
https://doi.org/10.1093/bioinformatics/btg447

Abstract

Motivation: Identifying the destination or localization of proteins is key to understanding their function and facilitating their purification. A number of existing computational prediction methods are based on sequence analysis. However, these methods are limited in scope, accuracy and most particularly breadth of coverage. Rather than using sequence information alone, we have explored the use of database text annotations from homologs and machine learning to substantially improve the prediction of subcellular location.

Results: We have constructed five machine-learning classifiers for predicting subcellular localization of proteins from animals, plants, fungi, Gram-negative bacteria and Gram-positive bacteria, which are 81% accurate for fungi and 92–94% accurate for the other four categories. These are the most accurate subcellular predictors across the widest set of organisms ever published. Our predictors are part of the Proteome Analyst web-service.

Availability: http://www.cs.ualberta.ca/~bioinfo/PA/Sub, http://www.cs.ualberta.ca/~bioinfo/PA

Supplementary information: http://www.cs.ualberta.ca/~bioinfo/PA/Subcellular
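The idea stated in the abstract, predicting a protein's location from the text annotations of its homologs rather than from sequence alone, can be illustrated with a minimal sketch. The abstract does not name the classifier or the annotation fields used, so everything below is an assumption made for illustration: a multinomial naive Bayes model with add-one smoothing, a hypothetical toy training set of annotation tokens, and the function names `train` and `predict`. It is not the authors' implementation.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Count labels and per-label tokens from (tokens, label) pairs."""
    label_counts = Counter()
    token_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in examples:
        label_counts[label] += 1
        token_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, token_counts, vocab


def predict(model, tokens):
    """Return the label with the highest naive Bayes log-probability."""
    label_counts, token_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        # Log prior of the label.
        score = math.log(label_counts[label] / total)
        # Add-one (Laplace) smoothing over the training vocabulary.
        denom = sum(token_counts[label].values()) + len(vocab)
        for tok in tokens:
            if tok in vocab:  # tokens never seen in training are ignored
                score += math.log((token_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: tokens standing in for annotation text that would
# be collected from a query protein's homologs. Not taken from the paper.
TOY_TRAINING = [
    (["transcription", "dna-binding", "nuclear"], "nucleus"),
    (["histone", "chromatin", "nuclear"], "nucleus"),
    (["respiratory", "chain", "mitochondrion"], "mitochondrion"),
    (["atp", "synthesis", "mitochondrion"], "mitochondrion"),
]
MODEL = train(TOY_TRAINING)

print(predict(MODEL, ["chromatin", "dna-binding"]))
print(predict(MODEL, ["atp", "chain"]))
```

In a real pipeline the tokens would come from database annotations of homologs found by a sequence search, and one such model would be trained per organism category, matching the five classifiers the abstract reports.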

Cited by 296 articles