LRpath: a logistic regression approach for identifying enriched biological groups in gene expression data
- 27 November 2008
- journal article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 25 (2), 211-217
- https://doi.org/10.1093/bioinformatics/btn592
Abstract
The elucidation of biological pathways enriched with differentially expressed genes has become an integral part of the analysis and interpretation of microarray data. Several statistical methods are commonly used in this context, but the question of the optimal approach has still not been resolved. We present a logistic regression-based method (LRpath) for identifying predefined sets of biologically related genes enriched with (or depleted of) differentially expressed transcripts in microarray experiments. We functionally relate the odds of gene set membership with the significance of differential expression, and calculate adjusted P-values as a measure of statistical significance. The new approach is compared with Fisher's exact test and other relevant methods in a simulation study and in the analysis of two breast cancer datasets. Overall results were concordant between the simulation study and the experimental data analysis, and provide useful information to investigators seeking to choose the appropriate method. LRpath displayed robust behavior and improved statistical power compared with tested alternatives. It is applicable in experiments involving two or more sample types, and accepts significance statistics of the investigator's choice as input.
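The core idea in the abstract, relating the odds of gene set membership to the significance of differential expression, can be illustrated with a single-predictor logistic regression. The following is a minimal sketch and not the published LRpath implementation: it assumes the significance statistic is -log10 of each gene's P-value, uses a Wald test on the slope, and runs on simulated data. The function names (`fit_logistic`, `lr_enrichment`) are hypothetical, and the multiple-testing adjustment across gene sets mentioned in the abstract is not shown.

```python
import math
import random


def _sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def _score_and_information(b0, b1, x, y):
    """Gradient and 2x2 information matrix of the logistic log-likelihood."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for xi, yi in zip(x, y):
        p = _sigmoid(b0 + b1 * xi)
        w = p * (1.0 - p)
        g0 += yi - p
        g1 += (yi - p) * xi
        h00 += w
        h01 += w * xi
        h11 += w * xi * xi
    return g0, g1, h00, h01, h11


def fit_logistic(x, y, max_iter=100, tol=1e-10):
    """Fit logit P(y=1) = b0 + b1*x by Newton-Raphson; return (b0, b1, se_b1)."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0, g1, h00, h01, h11 = _score_and_information(b0, b1, x, y)
        det = h00 * h11 - h01 * h01
        # Newton step: solve (information matrix) * delta = gradient.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    # Standard error of the slope from the inverse information at the estimate.
    _, _, h00, h01, h11 = _score_and_information(b0, b1, x, y)
    se_b1 = math.sqrt(h00 / (h00 * h11 - h01 * h01))
    return b0, b1, se_b1


def lr_enrichment(significance, in_set):
    """Return (slope, odds_ratio, wald_p) for one gene set.

    significance: per-gene statistic (assumed here: -log10 P-value).
    in_set: 1 if the gene belongs to the set, else 0.
    A positive slope suggests enrichment; a negative slope suggests depletion.
    """
    _, slope, se = fit_logistic(significance, in_set)
    z = slope / se
    wald_p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return slope, math.exp(slope), wald_p


# Simulated example (assumption): 1000 genes, the first 100 belong to the set
# and tend to have smaller P-values than the remaining 900.
random.seed(0)
in_set = [1] * 100 + [0] * 900
p_values = [(1.0 - random.random()) ** 4 for _ in range(100)]
p_values += [1.0 - random.random() for _ in range(900)]
significance = [-math.log10(p) for p in p_values]

slope, odds_ratio, wald_p = lr_enrichment(significance, in_set)
print(f"slope={slope:.3f} odds_ratio={odds_ratio:.3f} wald_p={wald_p:.3g}")
```

Here the odds ratio is the multiplicative change in the odds of set membership per one-unit increase in -log10 P. Unlike a Fisher's exact test on a 2x2 table, this formulation does not require choosing a significance cutoff to label genes as differentially expressed.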