iLoc-Euk: A Multi-Label Classifier for Predicting the Subcellular Localization of Singleplex and Multiplex Eukaryotic Proteins
Open Access
- 30 March 2011
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLOS ONE
- Vol. 6 (3), e18258
- https://doi.org/10.1371/journal.pone.0018258
Abstract
Predicting protein subcellular localization is an important and difficult problem, particularly when query proteins may be multiplex, i.e., may simultaneously exist at, or move between, two or more different subcellular location sites. Most existing protein subcellular location predictors can only deal with single-location or “singleplex” proteins. Multiple-location or “multiplex” proteins should not be ignored, however, because they usually possess unique biological functions worthy of special notice. By introducing “multi-labeled learning” and the “accumulation-layer scale”, a new predictor, called iLoc-Euk, has been developed that can deal with systems containing both singleplex and multiplex proteins. As a demonstration, jackknife cross-validation was performed with iLoc-Euk on a benchmark dataset of eukaryotic proteins classified into the following 22 location sites: (1) acrosome, (2) cell membrane, (3) cell wall, (4) centriole, (5) chloroplast, (6) cyanelle, (7) cytoplasm, (8) cytoskeleton, (9) endoplasmic reticulum, (10) endosome, (11) extracellular, (12) Golgi apparatus, (13) hydrogenosome, (14) lysosome, (15) melanosome, (16) microsome, (17) mitochondrion, (18) nucleus, (19) peroxisome, (20) spindle pole body, (21) synapse, and (22) vacuole, where none of the proteins included has ≥25% pairwise sequence identity to any other in the same subset. The overall success rate thus obtained by iLoc-Euk was 79%, which is significantly higher than that of any existing predictor that can also deal with such a complicated and stringent system. As a user-friendly web server, iLoc-Euk is freely accessible to the public at http://icpr.jci.edu.cn/bioinfo/iLoc-Euk.
It is anticipated that iLoc-Euk may become a useful bioinformatics tool for Molecular Cell Biology, Proteomics, Systems Biology, and Drug Development. Also, its novel approach will further stimulate the development of methods for predicting other protein attributes.
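The jackknife test mentioned in the abstract holds out each protein in turn, predicts its location set from all remaining proteins, and averages the resulting scores. The sketch below illustrates that procedure on made-up data. It is not the iLoc-Euk algorithm: the 1-nearest-neighbor rule, the two-dimensional feature vectors, the function names, and the Jaccard-overlap score are all assumptions chosen for brevity, and the paper's own success-rate definition may differ.

```python
from math import dist  # Euclidean distance, Python 3.8+


def predict_labels(query, training):
    """Hypothetical stand-in classifier: copy the label set of the
    nearest training example (1-NN). Not the iLoc-Euk predictor."""
    nearest = min(training, key=lambda item: dist(query, item[0]))
    return nearest[1]


def jackknife_accuracy(samples):
    """Leave-one-out evaluation for multi-label data.

    Each sample is (feature_vector, non-empty frozenset of labels).
    The score for one sample is the Jaccard overlap between the true
    and predicted label sets, so a multiplex protein with only one of
    its two sites recovered scores 0.5 rather than 0 or 1.
    """
    total = 0.0
    for i, (features, true_labels) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]  # hold out sample i
        predicted = predict_labels(features, rest)
        total += len(true_labels & predicted) / len(true_labels | predicted)
    return total / len(samples)


# Toy, invented data: four singleplex and two multiplex "proteins".
samples = [
    ((0.0, 0.0), frozenset({"cytoplasm"})),
    ((0.1, 0.0), frozenset({"cytoplasm"})),
    ((1.0, 1.0), frozenset({"nucleus"})),
    ((1.1, 1.0), frozenset({"nucleus"})),
    ((0.5, 0.5), frozenset({"cytoplasm", "nucleus"})),
    ((0.9, 0.9), frozenset({"cytoplasm", "nucleus"})),
]

score = jackknife_accuracy(samples)
print(f"jackknife Jaccard accuracy: {score:.3f}")
```

In this toy set the last protein is predicted as nucleus only, so it contributes a partial score of 0.5. A set-overlap score of this kind is one common way to give partial credit when evaluating multiplex proteins, which a plain correct/incorrect count cannot do.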