Predicting Protein Complexes from PPI Data: A Core-Attachment Approach

1 February 2009

journal article
research article
Published by Mary Ann Liebert Inc in Journal of Computational Biology

Vol. 16 (2), 133-144
https://doi.org/10.1089/cmb.2008.01tt

Abstract

Protein complexes play a critical role in many biological processes. Identifying the component proteins of a complex is an important step toward understanding the complex and the biological activities it takes part in. This paper addresses the problem of computationally predicting protein complexes from the protein-protein interaction (PPI) network of a single species. Most previous methods assume that proteins within the same complex interact with one another more often than with other proteins, so that complexes appear as dense subgraphs in the PPI network. However, existing software tools built on this assumption have had limited success. Recently, Gavin et al. (2006) provided a detailed study of the organization of protein complexes and suggested that a complex consists of two parts: a core and an attachment. Based on this core-attachment concept, we developed a novel approach that identifies complexes from the PPI network by identifying their cores and attachments separately. We evaluated the effectiveness of the proposed approach on three different datasets and compared the quality of our predicted complexes with those of three existing tools. The evaluation results show that our approach predicts many more complexes, and with higher accuracy, than these tools, with an improvement of over 30%. To verify the cores identified in each complex, we compared them, against the benchmark produced by Gavin et al. (2006), with the mediators produced by Andreopoulos et al. (2007), which were claimed to be cores. We found that our cores are of much higher quality: we obtain 10- to 30-fold more correctly predicted cores, with better accuracy. Availability: http://alse.cs.hku.hk/complexes/.
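
The abstract does not spell out how cores and attachments are computed, so the following is only a minimal sketch of the general core-attachment idea, not the authors' algorithm. It assumes, purely for illustration, that a core is a maximal clique of at least `min_core_size` proteins and that a protein outside the core counts as an attachment when it interacts with more than `attach_fraction` of the core members. The function names, both parameters, and the toy network are hypothetical.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map from (protein, protein) pairs."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-interactions
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with Bron-Kerbosch (with pivoting)."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))
            return
        # Pivot on the vertex with the most neighbours among the candidates.
        pivot = max(p | x, key=lambda n: len(adj[n] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def predict_complexes(edges, min_core_size=3, attach_fraction=0.5):
    """Return (core, attachments) pairs.

    Assumption (not from the paper): cores are maximal cliques, and an
    attachment interacts with more than `attach_fraction` of the core.
    """
    adj = build_adjacency(edges)
    complexes = []
    for core in maximal_cliques(adj):
        if len(core) < min_core_size:
            continue
        # Candidate attachments: neighbours of the core outside the core.
        candidates = set().union(*(adj[m] for m in core)) - core
        attachments = {
            c for c in candidates
            if len(adj[c] & core) > attach_fraction * len(core)
        }
        complexes.append((set(core), attachments))
    return complexes


if __name__ == "__main__":
    # Hypothetical toy PPI network.
    toy_edges = [("A", "B"), ("A", "C"), ("B", "C"),
                 ("D", "A"), ("D", "B"), ("E", "A")]
    for core, attachments in predict_complexes(toy_edges):
        print(sorted(core), sorted(attachments))
```

Treating cores and attachments as separate steps lets one protein attach to several cores, so the predicted complexes may overlap, which a plain partition of the network into dense subgraphs would not allow.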

This publication has 18 references indexed in Scilit:

Clustering by common friends finds locally significant proteins mediating modules
Bioinformatics, 2007
Evaluation of clustering algorithms for protein-protein interaction networks
BMC Bioinformatics, 2006
CFinder: locating cliques and overlapping modules in biological networks
Bioinformatics, 2006
Proteome survey reveals modularity of the yeast cell machinery
Nature, 2006
Protein complex prediction via cost-based clustering
Bioinformatics, 2004
An automated method for finding molecular complexes in large protein interaction networks
BMC Bioinformatics, 2003
An efficient algorithm for large-scale detection of protein families
Nucleic Acids Research, 2002
Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry
Nature, 2002
A comprehensive two-hybrid analysis to explore the yeast protein interactome
Proceedings of the National Academy of Sciences, 2001
Superparamagnetic Clustering of Data
Physical Review Letters, 1996

Cited by 195 articles