Meta-Analysis of Drosophila Circadian Microarray Studies Identifies a Novel Set of Rhythmically Expressed Genes

Abstract
Five independent groups have reported microarray studies that identify dozens of rhythmically expressed genes in the fruit fly Drosophila melanogaster. Limited overlap among the lists of discovered genes makes it difficult to determine which, if any, exhibit truly rhythmic patterns of expression. We reanalyzed data from all five reports and found two sources for the observed discrepancies: the use of different expression pattern detection algorithms, and underlying variation among the datasets. To improve upon the methods originally employed, we developed a new analysis that involves compilation of all existing data, application of identical transformation and standardization procedures, ANOVA-based statistical prescreening, and three separate classes of post hoc analysis: cross-correlation to various cycling waveforms, autocorrelation, and a previously described fast Fourier transform–based technique [1–3]. Permutation-based statistical tests were used to derive significance measures for all post hoc tests. We find that application of our method, most notably the ANOVA prescreening procedure, significantly reduces the false discovery rate relative to that observed among the results of the original five reports while maintaining desirable statistical power. We identify a set of 81 cycling transcripts previously found in one or more of the original reports as well as a novel set of 133 transcripts not found in any of the original studies. We introduce a novel analysis method that compensates for variability observed among the original five Drosophila circadian array reports. Based on the statistical fidelity of our meta-analysis results and the results of our initial validation experiments (quantitative RT-PCR), we predict that many of our newly found genes are bona fide cyclers, and suggest that they may lead to new insights into the pathways through which clock mechanisms regulate behavioral rhythms.
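The general shape of such an analysis can be illustrated with a minimal sketch. This is not the pipeline used in this study: the function names (`anova_f`, `peak_cosine_correlation`, `permutation_p`) are hypothetical, a single 24-hour cosine stands in for the "various cycling waveforms," and the autocorrelation and Fourier-based tests are omitted. It assumes one transcript sampled at known times, with samples taken at the same time of day treated as replicates for the ANOVA step.

```python
import numpy as np

def anova_f(values, groups):
    """One-way ANOVA F statistic; `groups` labels each sample's time of day."""
    values = np.asarray(values, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    grand_mean = values.mean()
    ss_between = 0.0
    ss_within = 0.0
    for g in labels:
        v = values[groups == g]
        ss_between += v.size * (v.mean() - grand_mean) ** 2
        ss_within += ((v - v.mean()) ** 2).sum()
    df_between = labels.size - 1
    df_within = values.size - labels.size
    return (ss_between / df_between) / (ss_within / df_within)

def peak_cosine_correlation(values, times, period=24.0, n_phases=24):
    """Largest Pearson correlation with a cosine of fixed period over a phase grid."""
    values = np.asarray(values, dtype=float)
    times = np.asarray(times, dtype=float)
    best = -1.0
    for k in range(n_phases):
        phase = k * period / n_phases
        reference = np.cos(2 * np.pi * (times - phase) / period)
        best = max(best, np.corrcoef(values, reference)[0, 1])
    return best

def permutation_p(values, times, stat, n_perm=1000, seed=0):
    """Permutation p-value: shuffle values across time points to build a null."""
    rng = np.random.default_rng(seed)
    observed = stat(values, times)
    null = np.array([stat(rng.permutation(values), times) for _ in range(n_perm)])
    # Add-one correction keeps the estimate strictly above zero.
    return (1 + np.sum(null >= observed)) / (n_perm + 1)

if __name__ == "__main__":
    # Synthetic example (assumed design): two days sampled every 4 hours.
    times = np.arange(0, 48, 4, dtype=float)
    rng = np.random.default_rng(1)
    profile = np.cos(2 * np.pi * (times - 8) / 24) + 0.1 * rng.standard_normal(times.size)
    print("F statistic:", anova_f(profile, times % 24))
    print("permutation p:", permutation_p(profile, times, peak_cosine_correlation))
```

In a two-stage design of this kind, only transcripts passing an F-statistic threshold would proceed to the permutation-tested post hoc step; the threshold itself is a tuning choice not specified here.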
Circadian genes regulate many of life's most essential processes, from sleeping and eating to cellular metabolism, learning, and much more. Many of these genes exhibit cyclic transcript expression, a characteristic exploited by an ever-expanding body of microarray-based studies to discover additional circadian genes. While these studies have identified hundreds of transcripts in a variety of organisms, their results show a striking lack of agreement, making it difficult to determine which transcripts, if any, are truly cycling. Here, we examine one group of these reports (those performed on the fruit fly, Drosophila melanogaster) to identify the sources of the observed differences and present a means of analyzing the data that drastically reduces their impact. We demonstrate the fidelity of our method by applying it to the original fruit fly microarray data, detecting more than 200 transcripts (133 of them novel) with a level of statistical fidelity better than that found in any of the original reports. Initial validation experiments (quantitative RT-PCR) suggest that these are truly cycling genes, one of which (cwo) is now known to be a bona fide circadian gene. We report the discovery of 133 novel candidate circadian genes as well as the highly adaptable method used to identify them.