Detecting differential gene expression with a semiparametric hierarchical mixture method

Open Access
Abstract
Mixture modeling provides an effective approach to the differential expression problem in microarray data analysis. Methods based on fully parametric mixture models are available, but lack of fit in some examples indicates that more flexible models may be beneficial. Existing, more flexible, mixture models work at the level of one‐dimensional gene‐specific summary statistics, and so when there are relatively few measurements per gene these methods may not provide sensitive detectors of differential expression. We propose a hierarchical mixture model to provide methodology that is both sensitive in detecting differential expression and sufficiently flexible to account for the complex variability of normalized microarray data. EM‐based algorithms are used to fit both parametric and semiparametric versions of the model. We restrict attention to the two‐sample comparison problem; an experiment involving Affymetrix microarrays and yeast translation provides the motivating case study. Gene‐specific posterior probabilities of differential expression form the basis of statistical inference; they define short gene lists and false discovery rates. Compared to several competing methodologies, the proposed methodology exhibits good operating characteristics in a simulation study, on the analysis of spike‐in data, and in a cross‐validation calculation.
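As an illustration of how gene-specific posterior probabilities can define a gene list together with a false discovery rate, the sketch below ranks genes by their posterior probability of differential expression and keeps the longest list whose average posterior probability of no differential expression stays at or below a target. This is one common convention, not necessarily the exact rule used in the paper. The function name `gene_list_by_posterior`, the parameter `target_fdr`, and the example probabilities are hypothetical, and the posterior probabilities are assumed to come from an already fitted mixture model.

```python
import numpy as np


def gene_list_by_posterior(post_de, target_fdr=0.05):
    """Select genes from posterior probabilities of differential expression.

    post_de    : array of P(gene is differentially expressed | data),
                 assumed to be supplied by a fitted mixture model.
    target_fdr : bound on the estimated false discovery rate of the list
                 (hypothetical parameter name).

    Returns (indices of selected genes, estimated FDR of that list).
    """
    post_de = np.asarray(post_de, dtype=float)

    # Rank genes from most to least likely to be differentially expressed.
    order = np.argsort(-post_de, kind="stable")
    post_null = 1.0 - post_de[order]

    # The mean posterior null probability over the top-k genes estimates
    # the FDR of the top-k list; it is nondecreasing in k.
    running_fdr = np.cumsum(post_null) / np.arange(1, post_null.size + 1)

    within_target = np.nonzero(running_fdr <= target_fdr)[0]
    if within_target.size == 0:
        # No list meets the target; return an empty list.
        return np.array([], dtype=int), 0.0

    k = within_target[-1] + 1
    return order[:k], float(running_fdr[k - 1])


# Made-up posterior probabilities for five genes, for illustration only.
example_post = [0.99, 0.20, 0.97, 0.60, 0.95]
selected, est_fdr = gene_list_by_posterior(example_post, target_fdr=0.05)
```

Because the list is built from the largest posterior probabilities first, tightening `target_fdr` can only shorten the list, which is how a bound of this kind yields the short gene lists mentioned above.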