Machine learning approaches for the prediction of signal peptides and other protein sorting signals

Open Access

1 January 1999

journal article
research article
Published by Oxford University Press (OUP) in Protein Engineering, Design and Selection

Vol. 12 (1), 3-9
https://doi.org/10.1093/protein/12.1.3

Abstract

Prediction of protein sorting signals from the amino acid sequence is of great importance in the field of proteomics today. Recently, the growth of protein databases, combined with machine learning approaches such as neural networks and hidden Markov models, has made it possible to achieve a level of reliability at which practical use in, for example, automatic database annotation is feasible. In this review, we concentrate on the present status and future perspectives of SignalP, our neural network-based method for prediction of the best-known sorting signal: the secretory signal peptide. We discuss the problems associated with the use of SignalP on genomic sequences, showing that signal peptide prediction will improve further if integrated with predictions of start codons and transmembrane helices. As a step towards this goal, a hidden Markov model version of SignalP has been developed, making it possible to discriminate between cleaved signal peptides and uncleaved signal anchors. Furthermore, we show how SignalP can be used to characterize putative signal peptides from an archaeon, Methanococcus jannaschii. Finally, we briefly review a few methods for predicting other protein sorting signals and discuss the future of protein sorting prediction in general.

Cited by 505 articles