Identifying functional modules in protein–protein interaction networks: an integrated exact approach
Open Access
- 1 July 2008
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 24 (13), i223-i231
- https://doi.org/10.1093/bioinformatics/btn161
Abstract
Motivation: With the exponential growth of expression and protein–protein interaction (PPI) data, the frontier of research in systems biology shifts more and more to the integrated analysis of these large datasets. Of particular interest is the identification of functional modules in PPI networks, sharing common cellular function beyond the scope of classical pathways, by means of detecting differentially expressed regions in PPI networks. This requires on the one hand an adequate scoring of the nodes in the network to be identified and on the other hand the availability of an effective algorithm to find the maximally scoring network regions. Various heuristic approaches have been proposed in the literature.

Results: Here we present the first exact solution for this problem, which is based on integer-linear programming and its connection to the well-known prize-collecting Steiner tree problem from Operations Research. Despite the NP-hardness of the underlying combinatorial problem, our method typically computes provably optimal subnetworks in large PPI networks in a few minutes. An essential ingredient of our approach is a scoring function defined on network nodes. We propose a new additive score with two desirable properties: (i) it is scalable by a statistically interpretable parameter and (ii) it allows a smooth integration of data from various sources. We apply our method to a well-established lymphoma microarray dataset in combination with associated survival data and the large interaction network of HPRD to identify functional modules by computing optimal-scoring subnetworks. In particular, we find a functional interaction module associated with proliferation over-expressed in the aggressive ABC subtype as well as modules derived from non-malignant by-stander cells.

Availability: Our software is available freely for non-commercial purposes at http://www.planet-lisa.net.

Contact: tobias.mueller@biozentrum.uni-wuerzburg.de
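To make the underlying combinatorial problem concrete, the following is a minimal sketch of the task the abstract describes: given additive node scores on an interaction network, find the connected set of nodes with the highest total score. It is not the integer-linear programming method of the paper; it is a brute-force enumeration that is exact only because the toy network is tiny, and its running time grows exponentially with network size. The function names, the node labels and the scores in `TOY_SCORES` are hypothetical and chosen purely for illustration.

```python
from itertools import combinations


def is_connected(nodes, adjacency):
    """Return True if `nodes` induce a connected subgraph (depth-first search)."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        for neighbour in adjacency[current]:
            if neighbour in nodes and neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen == nodes


def best_connected_subnetwork(scores, edges):
    """Exhaustively search all node subsets for the best-scoring connected one.

    Illustrative only: this enumerates 2^n subsets, so it is usable on toy
    inputs but not on real PPI networks.
    """
    adjacency = {node: set() for node in scores}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    best_nodes, best_score = frozenset(), float("-inf")
    names = sorted(scores)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            if not is_connected(subset, adjacency):
                continue
            total = sum(scores[node] for node in subset)  # additive node score
            if total > best_score:
                best_nodes, best_score = frozenset(subset), total
    return best_nodes, best_score


# Hypothetical toy network: a path A - B - C - D - E with made-up scores.
TOY_SCORES = {"A": 2.0, "B": -1.0, "C": 3.0, "D": -4.0, "E": 1.5}
TOY_EDGES = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]

if __name__ == "__main__":
    module, score = best_connected_subnetwork(TOY_SCORES, TOY_EDGES)
    print(sorted(module), score)
```

On this toy input the best module contains the negatively scoring node B, because including it connects the two positively scoring nodes A and C. This is the trade-off that makes the problem combinatorially hard at scale and that relates it to the prize-collecting Steiner tree problem named in the abstract.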