Identifying secretomes in people, pufferfish and pigs

Open Access

23 February 2004

journal article
research article
Published by Oxford University Press (OUP) in Nucleic Acids Research

Vol. 32 (4), 1414-1421
https://doi.org/10.1093/nar/gkh286

Abstract

The proteins processed by the secretory pathway (the secretome) are critical players in the development of multicellular eukaryotic organisms but have yet to be comprehensively studied at the genomic level. In this study, we use the TargetP algorithm to predict the human (13-20% of proteins found in individual datasets) and Fugu (14%) secretomes based on analysis of their nearly complete proteomes. We combine internal processing with prediction software to automate secreted protein identification and overcome one of the major challenges associated with EST data: identifying the minority of clones that encode N-terminally complete proteins. We discuss the use of these methods to predict secreted proteins in EST-based consensus sequence sets, and we validate these predictions using an assay for cell-free cotranslational translocation. Analysis of TIGR Porcine Gene Index 4.0 as a test dataset identified 352 N-terminally complete, putative secreted proteins. In functional agreement with our predictions, 34 of the 40 cDNAs tested (85%) were verified to be cotranslationally translocated in an in vitro translation system. The methods developed here are specifically designed to accept partial open reading frames and improve secreted protein predictions in eukaryotic transcriptomes, and are valuable for the analysis and annotation of eukaryotic EST databases.
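
The abstract does not spell out how N-terminally complete clones are recognised, so the following is only an illustrative sketch of one common heuristic, not the authors' pipeline. It assumes that an open reading frame counts as N-terminally complete when its ATG start codon is preceded by an in-frame stop codon in the same sequence. The function name `n_terminally_complete_orfs` and the `min_codons` threshold are hypothetical, only the forward strand is scanned, and the signal-peptide prediction step (TargetP in the paper) is not reproduced here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def n_terminally_complete_orfs(seq, min_codons=30):
    """Return (start, end) nucleotide spans of forward-strand ORFs whose ATG
    is preceded by an in-frame stop codon (an assumed completeness heuristic).

    ORFs that run off the 3' end are kept, since partial reading frames
    can still carry an intact N-terminus.
    """
    seq = seq.upper()
    hits = []
    for frame in range(3):
        codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
        seen_upstream_stop = False
        start_idx = None  # codon index of the current candidate ATG
        for idx, codon in enumerate(codons):
            if codon in STOP_CODONS:
                # Close the current ORF, if any, and keep it when long enough.
                if start_idx is not None and idx - start_idx >= min_codons:
                    hits.append((frame + 3 * start_idx, frame + 3 * idx + 3))
                seen_upstream_stop = True
                start_idx = None
            elif codon == "ATG" and seen_upstream_stop and start_idx is None:
                # First ATG after an in-frame stop: candidate true start.
                start_idx = idx
        # ORF still open at the 3' end: C-terminally partial, N-terminally complete.
        if start_idx is not None and len(codons) - start_idx >= min_codons:
            hits.append((frame + 3 * start_idx, frame + 3 * len(codons)))
    return hits
```

In a workflow of this kind, the protein translated from each returned span would then be passed to a signal-peptide predictor; sequences that begin with an ATG but lack an upstream stop are left out because their true start may lie further 5' than the clone extends.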

