Machine learning approaches for the prediction of signal peptides and other protein sorting signals
Open Access
- 1 January 1999
- journal article
- research article
- Published by Oxford University Press (OUP) in Protein Engineering, Design and Selection
- Vol. 12 (1), 3-9
- https://doi.org/10.1093/protein/12.1.3
Abstract
Prediction of protein sorting signals from the sequence of amino acids has great importance in the field of proteomics today. Recently, the growth of protein databases, combined with machine learning approaches such as neural networks and hidden Markov models, has made it possible to achieve a level of reliability where practical use in, for example, automatic database annotation is feasible. In this review, we concentrate on the present status and future perspectives of SignalP, our neural network-based method for prediction of the most well-known sorting signal: the secretory signal peptide. We discuss the problems associated with the use of SignalP on genomic sequences, showing that signal peptide prediction will improve further if integrated with predictions of start codons and transmembrane helices. As a step towards this goal, a hidden Markov model version of SignalP has been developed, making it possible to discriminate between cleaved signal peptides and uncleaved signal anchors. Furthermore, we show how SignalP can be used to characterize putative signal peptides from an archaeon, Methanococcus jannaschii. Finally, we briefly review a few methods for predicting other protein sorting signals and discuss the future of protein sorting prediction in general.