Interpretation, Design, and Analysis of Gene Array Expression Experiments

Abstract
Experiments using arrays of cDNA targets to compare patterns of gene expression are beginning to play a prominent role in biogerontology, but drawing reliable conclusions from the resulting data sets requires careful application of statistical methods that discriminate chance events from those likely to reflect real differences among the samples under study. This essay discusses flaws in the logic of studies that base their conclusions on ratio calculations alone, reviews the multiple-comparison traps inherent in high-throughput systems that test a very large number of mRNAs simultaneously, and advocates a two-stage design in which significance testing applied to exploratory data is used to guide a second round of hypothesis-testing experiments conducted in a separate set of experimental samples.