Methods for assessing reproducibility of clustering patterns observed in analyses of microarray data

Abstract
Motivation: Recent technological advances such as cDNA microarray technology have made it possible to simultaneously interrogate thousands of genes in a biological specimen. A cDNA microarray experiment produces a gene expression ‘profile’. Often interest lies in discovering novel subgroupings, or ‘clusters’, of specimens based on their profiles, for example, the identification of new tumor taxonomies. Cluster analysis techniques such as hierarchical clustering and self-organizing maps have frequently been used for investigating structure in microarray data. However, clustering algorithms always detect clusters, even on random data, and it is easy to misinterpret the results without some objective measure of the reproducibility of the clusters.

Results: We present statistical methods for testing for overall clustering of gene expression profiles, and we define easily interpretable measures of cluster-specific reproducibility that facilitate understanding of the clustering structure. We apply these methods to elucidate structure in cDNA microarray gene expression profiles obtained on melanoma tumors and on prostate specimens.

Availability: Software to implement these methods is contained in the BRB ArrayTools microarray analysis package, available from http://linus.nci.nih.gov./BRB-ArrayTools.html

Contact: lm5h@nih.gov
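
The concern raised in the Motivation, that a clustering algorithm returns clusters even on random data, can be illustrated with a short sketch. The code below is not the authors' method and not what BRB ArrayTools implements; it is a generic perturb-and-recluster check under stated assumptions. The function names (`cluster_labels`, `pair_reproducibility`), the choice of Ward linkage, the noise level, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def cluster_labels(data, k):
    """Ward hierarchical clustering of specimens (rows), cut into at most k clusters."""
    tree = linkage(data, method="ward")
    return fcluster(tree, t=k, criterion="maxclust")


def pair_reproducibility(data, k, noise_sd, n_perturb=20, seed=0):
    """Average fraction of originally co-clustered specimen pairs that remain
    co-clustered after Gaussian noise is added and the data are re-clustered.
    This is an illustrative statistic, assumed here, not taken from the paper."""
    rng = np.random.default_rng(seed)
    base = cluster_labels(data, k)
    upper = np.triu_indices(len(base), k=1)          # each specimen pair once
    base_pairs = (base[:, None] == base[None, :])[upper]
    if not base_pairs.any():
        return float("nan")
    scores = []
    for _ in range(n_perturb):
        noisy = data + rng.normal(0.0, noise_sd, size=data.shape)
        labels = cluster_labels(noisy, k)
        same = (labels[:, None] == labels[None, :])[upper]
        scores.append(same[base_pairs].mean())
    return float(np.mean(scores))


rng = np.random.default_rng(1)
n_per_group, n_genes, k = 10, 200, 3

# Pure noise: 30 simulated "specimens" with no group structure at all.
random_data = rng.normal(size=(3 * n_per_group, n_genes))

# Structured data: three groups of specimens with distinct mean profiles.
centers = rng.normal(0.0, 3.0, size=(k, n_genes))
structured_data = np.repeat(centers, n_per_group, axis=0) + rng.normal(
    size=(3 * n_per_group, n_genes)
)

random_labels = cluster_labels(random_data, k)
random_score = pair_reproducibility(random_data, k, noise_sd=1.0)
structured_score = pair_reproducibility(structured_data, k, noise_sd=1.0)

print("clusters found on pure noise:", len(set(random_labels)))
print("reproducibility on pure noise:", round(random_score, 2))
print("reproducibility on structured data:", round(structured_score, 2))
```

Cutting the tree always yields the requested clusters, so their existence says nothing by itself; in this sketch only the behavior under perturbation separates the structured data from the noise. The noise standard deviation is a tuning choice here and would in practice need to reflect the measurement variability of the experiment.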
