Tackling the widespread and critical impact of batch effects in high-throughput data
Open Access
- 14 September 2010
- journal article
- research article
- Published by Springer Nature in Nature Reviews Genetics
- Vol. 11 (10), 733-739
- https://doi.org/10.1038/nrg2825
Abstract
Batch effects can lead to incorrect biological conclusions but are not widely considered. The authors show that batch effects are relevant to a range of high-throughput 'omics' data sets and are crucial to address. They also explain how batch effects can be mitigated.

High-throughput technologies are widely used, for example to assay genetic variants, gene and protein expression, and epigenetic modifications. One often overlooked complication with such studies is batch effects, which occur because measurements are affected by laboratory conditions, reagent lots and personnel differences. This becomes a major problem when batch effects are correlated with an outcome of interest and lead to incorrect conclusions. Using both published studies and our own analyses, we argue that batch effects (as well as other technical and biological artefacts) are widespread and critical to address. We review experimental and computational approaches for doing so.