Supervised normalization of microarrays
Open Access
- 31 March 2010
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 26 (10), 1308-1315
- https://doi.org/10.1093/bioinformatics/btq118
Abstract
Motivation: A major challenge in utilizing microarray technologies to measure nucleic acid abundances is ‘normalization’, the goal of which is to separate biologically meaningful signal from other confounding sources of signal, often due to unavoidable technical factors. It is intuitively clear that true biological signal and confounding factors need to be simultaneously considered when performing normalization. However, the most popular normalization approaches do not utilize what is known about the study, both in terms of the biological variables of interest and the known technical factors in the study, such as batch or array processing date.
Results: We show here that failing to include all study-specific biological and technical variables when performing normalization leads to biased downstream analyses. We propose a general normalization framework that fits a study-specific model employing every known variable that is relevant to the expression study. The proposed method is generally applicable to the full range of existing probe designs, as well as to both single-channel and dual-channel arrays. We show through real and simulated examples that the method has favorable operating characteristics in comparison to some of the most highly used normalization methods.
Availability: An R package called snm implementing the methodology will be made available from Bioconductor (http://bioconductor.org).
Contact: jstorey@princeton.edu
Supplementary information: Supplementary data are available at Bioinformatics online.
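The claim that leaving the biological variable out of normalization biases downstream analyses can be illustrated with a small simulation. The sketch below is not the snm algorithm or its API; it is a simplified single-probe linear model, assuming one binary biological variable and one binary batch variable that are correlated by design. The function names (`remove_batch`, `group_difference`) and all simulated values are hypothetical.

```python
import numpy as np


def remove_batch(y, bio, batch, include_bio):
    """Subtract the fitted batch effect from one probe's log-intensities.

    y, bio, batch: 1-D float arrays of equal length (bio and batch coded 0/1).
    include_bio: if True, the biological variable is fitted jointly with batch.
    """
    cols = [np.ones_like(y), batch]
    if include_bio:
        cols.append(bio)
    X = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    # coef[1] is the estimated batch effect in both model variants.
    return y - coef[1] * batch


def group_difference(y, bio):
    """Difference in mean expression between the two biological groups."""
    return y[bio == 1].mean() - y[bio == 0].mean()


# Hypothetical unbalanced design: 100 arrays per biological group,
# with 20% of group 0 and 80% of group 1 processed in batch 1.
bio = np.repeat([0.0, 1.0], 100)
batch = np.concatenate([
    np.repeat([1.0, 0.0], [20, 80]),
    np.repeat([1.0, 0.0], [80, 20]),
])

# Simulated probe: true biological effect 1.0, true batch effect 2.0.
rng = np.random.default_rng(0)
y = 5.0 + 1.0 * bio + 2.0 * batch + rng.normal(0.0, 0.1, bio.size)

# Normalization that ignores biology versus one that models it jointly.
naive = group_difference(remove_batch(y, bio, batch, include_bio=False), bio)
joint = group_difference(remove_batch(y, bio, batch, include_bio=True), bio)

print(f"batch-only adjustment: {naive:.2f}")
print(f"joint adjustment:      {joint:.2f}")
```

In this design the batch-only fit absorbs part of the biological signal into the batch coefficient, so the group difference after adjustment is attenuated well below the simulated effect of 1.0, while the joint fit recovers it closely. The framework described in the abstract is more general than this sketch, which only shows why the two kinds of variables need to be considered together.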