Adjustment of systematic microarray data biases

Open Access
Abstract
Motivation: Systematic differences due to experimental features of microarray experiments are present in most large microarray data sets. Many different experimental features can cause biases including different sources of RNA, different production lots of microarrays or different microarray platforms. These systematic effects present a substantial hurdle to the analysis of microarray data.

Results: We present here a new method for the identification and adjustment of systematic biases that are present within microarray data sets. Our approach is based on modern statistical discrimination methods and is shown to be very effective in removing systematic biases present in a previously published breast tumor cDNA microarray data set. The new method of ‘Distance Weighted Discrimination (DWD)’ is shown to be better than Support Vector Machines and Singular Value Decomposition for the adjustment of systematic microarray effects. In addition, it is shown to be of general use as a tool for the discrimination of systematic problems present in microarray data sets, including the merging of two breast tumor data sets completed on different microarray platforms.

Availability: Matlab software to perform DWD can be retrieved from https://genome.unc.edu/pubsup/dwd/

Supplementary information: The complete figures that represent the cluster diagrams in Figure 6 and other figures are available at https://genome.unc.edu/pubsup/dwd/
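
Illustrative sketch (not part of the original abstract): the general idea of a discrimination-based adjustment can be shown in a few lines of Python. The abstract does not spell out the procedure, so the steps here are an assumption: find a direction that separates two batches, project each sample onto it, and subtract each batch's mean projection along that direction. The direction used below is the simple difference of batch means, a stand-in chosen for brevity; it is not DWD, and the authors' Matlab software should be used for the actual method. All function names are hypothetical.

```python
import math


def mean_vector(samples):
    """Per-gene mean over a list of samples (each sample is a list of floats)."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def separating_direction(batch_a, batch_b):
    """Unit vector along the difference of batch means.

    Assumption: this is a stand-in for a discrimination direction.
    It is NOT the DWD direction described in the abstract.
    """
    diff = [b - a for a, b in zip(mean_vector(batch_a), mean_vector(batch_b))]
    norm = math.sqrt(dot(diff, diff))
    if norm == 0:
        raise ValueError("batch means coincide; there is no shift to remove")
    return [d / norm for d in diff]


def remove_batch_shift(batch, direction):
    """Subtract the batch's mean projection along `direction` from every sample."""
    mean_proj = sum(dot(s, direction) for s in batch) / len(batch)
    return [[x - mean_proj * d for x, d in zip(s, direction)] for s in batch]


def adjust_batches(batch_a, batch_b):
    """Shift both batches so their mean projections on the direction are zero."""
    direction = separating_direction(batch_a, batch_b)
    return (
        remove_batch_shift(batch_a, direction),
        remove_batch_shift(batch_b, direction),
        direction,
    )


# Toy data: two batches of three samples, two genes each (made-up numbers).
batch_a = [[1.0, 2.0], [1.2, 2.1], [0.8, 1.9]]
batch_b = [[3.0, 2.0], [3.2, 2.2], [2.8, 1.8]]
adj_a, adj_b, direction = adjust_batches(batch_a, batch_b)
```

With the mean-difference direction, the adjusted batches end up with identical mean vectors, while variation orthogonal to the direction is left untouched. A direction from a different discrimination method would generally remove the shift along a different axis.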