Adjusting batch effects in microarray expression data using empirical Bayes methods

Open Access

21 April 2006

journal article
Published by Oxford University Press (OUP) in Biostatistics

Vol. 8 (1), 118-127
https://doi.org/10.1093/biostatistics/kxj037

Abstract

Non-biological experimental variation, or “batch effects”, is commonly observed across multiple batches of microarray experiments and often makes it difficult to combine data from these batches. Combining microarray data sets lets researchers increase statistical power to detect biological phenomena in studies where logistical considerations restrict sample size or that require the sequential hybridization of arrays. In general, it is inappropriate to combine data sets without adjusting for batch effects. Methods have been proposed to filter batch effects from data, but these are often complicated and require large batch sizes ( ) to implement. Because the majority of microarray studies are conducted using much smaller sample sizes, existing methods are not sufficient. We propose parametric and non-parametric empirical Bayes frameworks for adjusting data for batch effects that are robust to outliers in small sample sizes and perform comparably to existing methods for large samples. We illustrate our methods using two example data sets and show that our methods are justifiable, easy to apply, and useful in practice. Software for our method is freely available at: http://biosun1.harvard.edu/complab/batch/.

Cited by 7017 articles