Interactive Exploration of Microarray Gene Expression Patterns in a Reduced Dimensional Space
Open Access
- 1 July 2002
- journal article
- research article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 12 (7), 1112-1120
- https://doi.org/10.1101/gr.225302
Abstract
The very high dimensional space of gene expression measurements obtained by DNA microarrays impedes the detection of underlying patterns in gene expression data and the identification of discriminatory genes. In this paper we show the use of projection methods such as principal components analysis (PCA) to obtain a direct link between patterns in the genes and patterns in samples. This feature is useful in the initial interactive pattern exploration of gene expression data and data-driven learning of the nature and types of samples. Using oligonucleotide microarray measurements of 40 samples from different normal human tissues, we show that distinct patterns are obtained when the genes are projected on a two-dimensional plane spanned by the loadings of the two major principal components. These patterns define the particular genes associated with a sample class (i.e., tissue). When used separately from the other genes, these class-specific (i.e., tissue-specific) genes in turn define distinct tissue patterns in the projection space spanned by the scores of the two major principal components. In this study, PCA projection facilitated discriminatory gene selection for different tissues and identified tissue-specific gene expression signatures for liver, skeletal muscle, and brain samples. Furthermore, it allowed the classification of nine new samples belonging to these three types using the linear combination of the expression levels of the tissue-specific genes determined from the first set of samples. The application of the technique to other published data sets is also discussed. [Online supplementary material available at www.genome.org.]
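The link between gene loadings and sample scores described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic (three hypothetical tissue classes, each with a block of up-regulated genes), PCA is computed with a plain singular value decomposition in NumPy, genes are selected by their distance from the origin of the loading plane, and new samples are assigned by a nearest-centroid rule in score space. The function names, the selection rule, the classifier, and all numeric settings are assumptions chosen for illustration.

```python
import numpy as np


def pca(X, n_components=2):
    """PCA of a samples-by-genes matrix via SVD.

    Returns the per-gene mean, the sample scores, and the gene loadings.
    """
    mean = X.mean(axis=0)
    U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]  # one row per sample
    loadings = Vt[:n_components].T                   # one row per gene
    return mean, scores, loadings


def simulate(rng, labels, n_genes, n_specific, shift=6.0):
    """Synthetic expression data (assumption, not the paper's data):
    unit-variance noise plus a block of class-specific genes per class."""
    X = rng.normal(0.0, 1.0, size=(labels.size, n_genes))
    for c in np.unique(labels):
        X[labels == c, c * n_specific:(c + 1) * n_specific] += shift
    return X


rng = np.random.default_rng(0)
n_genes, n_specific = 300, 20
labels = np.repeat([0, 1, 2], 5)  # 15 training samples, 3 classes
X = simulate(rng, labels, n_genes, n_specific)

# Step 1: project the genes onto the plane spanned by the first two loadings.
_, _, gene_loadings = pca(X)

# Step 2: keep the genes lying farthest from the origin of that plane
# (a simple stand-in for picking out the patterns interactively).
dist = np.linalg.norm(gene_loadings, axis=1)
selected = np.argsort(dist)[-3 * n_specific:]

# Step 3: redo PCA on the selected genes only and look at the sample scores.
mean_sel, sample_scores, load_sel = pca(X[:, selected])
centroids = np.array(
    [sample_scores[labels == c].mean(axis=0) for c in range(3)]
)

# Step 4: project nine new samples with the same linear combination of
# selected genes and assign each one to the nearest class centroid.
new_labels = np.repeat([0, 1, 2], 3)
X_new = simulate(rng, new_labels, n_genes, n_specific)
new_scores = (X_new[:, selected] - mean_sel) @ load_sel
d = np.linalg.norm(new_scores[:, None, :] - centroids[None, :, :], axis=2)
predicted = d.argmin(axis=1)
```

Plotting `gene_loadings` and `sample_scores` as two scatter plots gives the kind of paired gene and sample views the abstract refers to. On real microarray data the class-specific signal is weaker than in this simulation, so the selection threshold would need to be chosen by inspection.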