Predicting Protein Complexes from PPI Data: A Core-Attachment Approach
- 1 February 2009
- journal article
- research article
- Published by Mary Ann Liebert Inc in Journal of Computational Biology
- Vol. 16 (2), 133-144
- https://doi.org/10.1089/cmb.2008.01tt
Abstract
Protein complexes play a critical role in many biological processes. Identifying the component proteins in a protein complex is an important step in understanding the complex as well as the related biological activities. This paper addresses the problem of predicting protein complexes from the protein-protein interaction (PPI) network of one species using a computational approach. Most of the previous methods rely on the assumption that proteins within the same complex would have relatively more interactions. This translates into dense subgraphs in the PPI network. However, the existing software tools have limited success. Recently, Gavin et al. (2006) provided a detailed study on the organization of protein complexes and suggested that a complex consists of two parts: a core and an attachment. Based on this core-attachment concept, we developed a novel approach to identify complexes from the PPI network by identifying their cores and attachments separately. We evaluated the effectiveness of our proposed approach using three different datasets and compared the quality of our predicted complexes with three existing tools. The evaluation results show that we can predict many more complexes and with higher accuracy than these tools with an improvement of over 30%. To verify the cores we identified in each complex, we compared our cores with the mediators produced by Andreopoulos et al. (2007), which were claimed to be the cores, based on the benchmark result produced by Gavin et al. (2006). We found that the cores we produced are of much higher quality ranging from 10- to 30-fold more correctly predicted cores and with better accuracy. Availability: http://alse.cs.hku.hk/complexes/
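The two ideas in the abstract, dense subgraphs and a core with attachments, can be illustrated with a small sketch. This is not the paper's algorithm: the function names (`density`, `attachments`), the toy network, and the rule that a protein is attached when it interacts with more than half of the core members (`min_fraction=0.5`) are assumptions chosen only for illustration.

```python
from itertools import combinations


def density(nodes, edges):
    """Fraction of all possible protein pairs in `nodes` that interact."""
    nodes = set(nodes)
    n = len(nodes)
    if n < 2:
        return 0.0
    present = sum(
        1 for u, v in combinations(sorted(nodes), 2)
        if frozenset((u, v)) in edges
    )
    return present / (n * (n - 1) / 2)


def attachments(core, edges, all_nodes, min_fraction=0.5):
    """Proteins outside `core` that interact with more than `min_fraction`
    of the core members (an assumed rule, for illustration only)."""
    core = set(core)
    if not core:
        return set()
    attached = set()
    for protein in set(all_nodes) - core:
        hits = sum(1 for c in core if frozenset((protein, c)) in edges)
        if hits / len(core) > min_fraction:
            attached.add(protein)
    return attached


# Hypothetical toy PPI network: A, B, C form a fully connected core.
raw_edges = [("A", "B"), ("A", "C"), ("B", "C"),
             ("D", "A"), ("D", "B"), ("E", "C")]
edges = {frozenset(e) for e in raw_edges}
nodes = {p for e in raw_edges for p in e}

core = {"A", "B", "C"}
core_density = density(core, edges)
attached = attachments(core, edges, nodes)
predicted_complex = core | attached
```

In this toy network the core is fully connected, protein D touches two of the three core members and is attached, and protein E touches only one and is left out. Treating the core and the attachments as separate steps is what distinguishes this view from methods that search for a single dense subgraph.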
This publication has 18 references indexed in Scilit:
- Clustering by common friends finds locally significant proteins mediating modules. Bioinformatics, 2007
- Evaluation of clustering algorithms for protein-protein interaction networks. BMC Bioinformatics, 2006
- CFinder: locating cliques and overlapping modules in biological networks. Bioinformatics, 2006
- Proteome survey reveals modularity of the yeast cell machinery. Nature, 2006
- Protein complex prediction via cost-based clustering. Bioinformatics, 2004
- An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinformatics, 2003
- An efficient algorithm for large-scale detection of protein families. Nucleic Acids Research, 2002
- Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature, 2002
- A comprehensive two-hybrid analysis to explore the yeast protein interactome. Proceedings of the National Academy of Sciences, 2001
- Superparamagnetic Clustering of Data. Physical Review Letters, 1996