Network-based functional enrichment
Open Access
- 30 November 2011
- journal article
- Published by Springer Nature in BMC Bioinformatics
- Vol. 12 (S13), 1-S14
- https://doi.org/10.1186/1471-2105-12-s13-s14
Abstract
Background: Many methods have been developed to infer and reason about molecular interaction networks. These approaches often yield networks with hundreds or thousands of nodes and up to an order of magnitude more edges. It is often desirable to summarize the biological information in such networks. A very common approach is to use gene function enrichment analysis for this task. A major drawback of this method is that it ignores information about the edges in the network being analyzed, i.e., it treats the network simply as a set of genes. In this paper, we introduce a novel method for functional enrichment that explicitly takes network interactions into account.
Results: Our approach naturally generalizes Fisher’s exact test, a gene set-based technique. Given a function of interest, we compute the subgraph of the network induced by genes annotated to this function. We use the sequence of sizes of the connected components of this sub-network to estimate its connectivity. We estimate the statistical significance of the connectivity empirically by a permutation test. We present three applications of our method: i) determine which functions are enriched in a given network, ii) given a network and an interesting sub-network of genes within that network, determine which functions are enriched in the sub-network, and iii) given two networks, determine the functions for which the connectivity improves when we merge the second network into the first. Through these applications, we show that our approach is a natural alternative to network clustering algorithms.
Conclusions: We presented a novel approach to functional enrichment that takes into account the pairwise relationships among genes annotated by a particular function. Each of the three applications discovers highly relevant functions. We used our methods to study biological data from three different organisms. Our results demonstrate the wide applicability of our methods.
Our algorithms are implemented in C++ and are freely available under the GNU General Public License at our supplementary website. Additionally, all our input data and results are available at http://bioinformatics.cs.vt.edu/~murali/supplements/2011-incob-nbe/.
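The procedure outlined in the Results (induce a subgraph on the genes annotated to a function, measure the sizes of its connected components, and assess significance by permutation) can be illustrated with a minimal sketch. This is not the authors' C++ implementation. The abstract does not say how the sequence of component sizes is summarized or how the permutations are drawn, so the sketch assumes the size of the largest component as the connectivity score and random gene sets of equal size as the null model. All function names here are hypothetical.

```python
import random
from collections import deque


def induced_component_sizes(adj, genes):
    """Return the component sizes (largest first) of the subgraph
    of `adj` induced by `genes`. `adj` maps each node to a set of neighbors."""
    genes = set(genes) & set(adj)
    seen = set()
    sizes = []
    for start in genes:
        if start in seen:
            continue
        # Breadth-first search restricted to annotated genes.
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v in genes and v not in seen:
                    seen.add(v)
                    queue.append(v)
        sizes.append(size)
    return sorted(sizes, reverse=True)


def connectivity(adj, genes):
    """ASSUMPTION: summarize the size sequence by its largest entry.
    The paper may use a different summary of the sequence."""
    sizes = induced_component_sizes(adj, genes)
    return sizes[0] if sizes else 0


def permutation_p_value(adj, annotated, n_perm=1000, seed=0):
    """Empirical p-value for the connectivity of `annotated`.
    ASSUMPTION: the null model draws random gene sets of the same size
    from the nodes of the network."""
    rng = random.Random(seed)
    nodes = sorted(adj)
    annotated = set(annotated) & set(adj)
    observed = connectivity(adj, annotated)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(nodes, len(annotated))
        if connectivity(adj, sample) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


# Toy network: a path a-b-c, an edge d-e, and an isolated node f.
toy = {
    "a": {"b"}, "b": {"a", "c"}, "c": {"b"},
    "d": {"e"}, "e": {"d"}, "f": set(),
}
sizes = induced_component_sizes(toy, ["a", "b", "d", "f"])
p = permutation_p_value(toy, ["a", "b", "c"], n_perm=200)
```

In the toy network, genes a and b form one component while d and f are isolated in the induced subgraph, because e is not annotated. A gene set that is tightly connected relative to random sets of the same size receives a small empirical p-value.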