A general modular framework for gene set enrichment analysis
Open Access
- 3 February 2009
- journal article
- research article
- Published by Springer Nature in BMC Bioinformatics
- Vol. 10 (1), 1-20
- https://doi.org/10.1186/1471-2105-10-47
Abstract
Analysis of microarray and other high-throughput data on the basis of gene sets, rather than individual genes, is becoming more important in genomic studies. Correspondingly, a large number of statistical approaches for detecting gene set enrichment have been proposed, but both the interrelations and the relative performance of the various methods are still very much unclear. We conduct an extensive survey of statistical approaches for gene set analysis and identify a common modular structure underlying most published methods. Based on this finding we propose a general framework for detecting gene set enrichment. This framework provides a meta-theory of gene set analysis that not only helps to gain a better understanding of the relative merits of each embedded approach but also facilitates a principled comparison and offers insights into the relative interplay of the methods. We use this framework to conduct a computer simulation comparing 261 different variants of gene set enrichment procedures and to analyze two experimental data sets. Based on the results we offer recommendations for best practices regarding the choice of effective procedures for gene set enrichment analysis.
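To make the idea of a "common modular structure" concrete, the following is a minimal Python sketch of how a gene set enrichment procedure could be assembled from interchangeable parts. The abstract does not name the modules, so the four-step split used here (gene-level statistic, transformation, gene set statistic, significance assessment by resampling genes) is an illustrative assumption, as are all function and parameter names (`welch_t`, `enrichment_p_value`, `transform`, `set_stat`, `n_resamples`). It is not the authors' implementation.

```python
import random
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """Gene-level statistic (assumed choice): Welch's t for two groups."""
    se = sqrt(variance(x) / len(x) + variance(y) / len(y))
    return (mean(x) - mean(y)) / se


def enrichment_p_value(expr, labels, gene_set, transform=abs,
                       set_stat=mean, n_resamples=999, seed=0):
    """Score one gene set and estimate a p-value by resampling genes.

    expr:     dict mapping gene name -> list of expression values
    labels:   list of 0/1 group labels, one per sample
    gene_set: iterable of gene names forming the set of interest
    """
    rng = random.Random(seed)

    # Modules 1 and 2: compute a statistic per gene, then transform it.
    scores = {}
    for gene, values in expr.items():
        group1 = [v for v, lab in zip(values, labels) if lab == 1]
        group0 = [v for v, lab in zip(values, labels) if lab == 0]
        scores[gene] = transform(welch_t(group1, group0))

    # Module 3: summarize the transformed scores of the set members.
    members = [g for g in gene_set if g in scores]
    observed = set_stat([scores[g] for g in members])

    # Module 4: compare against random gene sets of the same size.
    all_genes = list(scores)
    hits = 0
    for _ in range(n_resamples):
        drawn = rng.sample(all_genes, len(members))
        if set_stat([scores[g] for g in drawn]) >= observed:
            hits += 1

    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (hits + 1) / (n_resamples + 1)
```

Swapping `transform` (for example `abs` for a squaring function) or `set_stat` (for example `mean` for `statistics.median`) yields a different variant of the procedure without touching the other steps, which is the kind of combinatorial comparison the abstract describes. Resampling sample labels instead of genes would be another possible choice for the last step and is not shown here.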