Differential Use of Signal Peptides and Membrane Domains Is a Common Occurrence in the Protein Output of Transcriptional Units
Open Access
- 28 April 2006
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLoS Genetics
- Vol. 2 (4), e46
- https://doi.org/10.1371/journal.pgen.0020046
Abstract
Membrane organization describes the orientation of a protein with respect to the membrane and can be determined by the presence, or absence, and organization within the protein sequence of two features: endoplasmic reticulum signal peptides and alpha-helical transmembrane domains. These features allow protein sequences to be classified into one of five membrane organization categories: soluble intracellular proteins, soluble secreted proteins, type I membrane proteins, type II membrane proteins, and multi-spanning membrane proteins. Generation of protein isoforms with variable membrane organizations can change a protein's subcellular localization or association with the membrane. Application of MemO, a membrane organization annotation pipeline, to the FANTOM3 Isoform Protein Sequence mouse protein set revealed that within the 8,032 transcriptional units (TUs) with multiple protein isoforms, 573 had variation in their use of signal peptides, 1,527 had variation in their use of transmembrane domains, and 615 generated protein isoforms from distinct membrane organization classes. The mechanisms underlying these transcript variations were analyzed. While TUs were identified encoding all pairwise combinations of membrane organization categories, the most common was conversion of membrane proteins to soluble proteins. Observed within our high-confidence set were 156 TUs predicted to generate both extracellular soluble and membrane proteins, and 217 TUs generating both intracellular soluble and membrane proteins. The differential use of endoplasmic reticulum signal peptides and transmembrane domains is a common occurrence within the variable protein output of TUs. The generation of protein isoforms that are targeted to multiple subcellular locations represents a major functional consequence of transcript variation within the mouse transcriptome. 
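The five-category scheme described above can be sketched as a simple decision rule over two predicted features. This is a minimal illustration, not the MemO pipeline itself. It assumes the conventional reading of the categories: a signal peptide with a single transmembrane domain is type I, a single transmembrane domain without a signal peptide is type II, and two or more transmembrane domains are multi-spanning. The names `classify` and `has_variable_organization` are hypothetical, and the prediction of the features themselves is not shown.

```python
from enum import Enum
from typing import Iterable, Tuple


class MembraneOrganization(Enum):
    SOLUBLE_INTRACELLULAR = "soluble intracellular"
    SOLUBLE_SECRETED = "soluble secreted"
    TYPE_I = "type I membrane"
    TYPE_II = "type II membrane"
    MULTI_SPANNING = "multi-spanning membrane"


def classify(has_signal_peptide: bool, tm_domain_count: int) -> MembraneOrganization:
    """Assign a category from two predicted features (assumed rule, see lead-in)."""
    if tm_domain_count < 0:
        raise ValueError("tm_domain_count must be non-negative")
    if tm_domain_count >= 2:
        return MembraneOrganization.MULTI_SPANNING
    if tm_domain_count == 1:
        # Assumption: a signal peptide plus one TM domain is treated as type I,
        # one TM domain without a signal peptide as type II.
        return (MembraneOrganization.TYPE_I if has_signal_peptide
                else MembraneOrganization.TYPE_II)
    return (MembraneOrganization.SOLUBLE_SECRETED if has_signal_peptide
            else MembraneOrganization.SOLUBLE_INTRACELLULAR)


def has_variable_organization(isoforms: Iterable[Tuple[bool, int]]) -> bool:
    """True if the isoforms of one transcriptional unit fall into more than one category.

    Each isoform is given as (has_signal_peptide, tm_domain_count).
    """
    categories = {classify(sp, tm) for sp, tm in isoforms}
    return len(categories) > 1


# Hypothetical TU with a membrane-bound isoform and a secreted isoform.
example_tu = [(True, 1), (True, 0)]
print(has_variable_organization(example_tu))
```

Under this rule, a transcriptional unit whose isoforms differ only by loss of the transmembrane domain would be counted among those generating both membrane and soluble proteins.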
Many genes produce only a single protein; however, others are known to produce a number of proteins with different functions in the cell. The function of a protein within the cell is influenced by its location; for example, proteins that are secreted can act as messengers, whereas proteins embedded in the membrane may act as receptors or channels. Features that determine the eventual location of a protein are found in the protein sequence. The authors identified two such features, the signal peptide that targets a protein for secretion, and the transmembrane domain that embeds a protein in the membrane, predicting their occurrence in mouse protein sequences. The authors then searched the entire mouse genome for genes that vary in the use of these features in protein isoforms. They found a large number of genes that produce proteins with variation in these features; for example, they identified genes producing proteins that are both secreted and intracellular, and genes producing proteins that are both membrane bound and soluble. This process is likely to be a major source of functional variation in the output of mammalian genes.