Frequent-subsequence-based prediction of outer membrane proteins
- 24 August 2003
- conference paper
- Published by Association for Computing Machinery (ACM)
- p. 436-445
- https://doi.org/10.1145/956750.956800
Abstract
A number of medically important disease-causing bacteria (collectively called Gram-negative bacteria) are noted for the extra "outer" membrane that surrounds their cells. Proteins resident in this membrane (outer membrane proteins, or OMPs) are of primary research interest for antibiotic and vaccine drug design because they sit on the surface of the bacteria and so are the most accessible targets for new drugs. With the development of genome sequencing technology and bioinformatics, biologists can now deduce all the proteins that are likely produced in a given bacterium and have attempted to classify where proteins are located in a bacterial cell. However, such protein localization programs are currently least accurate when predicting OMPs, so there is a need for a better OMP classifier. Data mining research suggests that frequent patterns perform well in aiding the development of accurate and efficient classification algorithms. In this paper, we present two methods to identify OMPs based on frequent subsequences and test them on all Gram-negative bacterial proteins whose localizations have been determined by biological experiments. One classifier follows an association rule approach, while the other is based on support vector machines (SVMs). We compare the proposed methods with the state-of-the-art methods in the biological domain. The results demonstrate that our methods are better both in terms of accurately identifying OMPs and providing biological insights that increase our understanding of the structures and functions of these important proteins.