Finding Genes in the C2C12 Osteogenic Pathway by k-Nearest-Neighbor Classification of Expression Data

Open Access

1 January 2002

journal article
Published by Cold Spring Harbor Laboratory in Genome Research

Vol. 12 (1), 165-176
https://doi.org/10.1101/gr.182601

Abstract

A supervised classification scheme for analyzing microarray expression data, based on the k-nearest-neighbor method coupled to noise-reduction filters, has been used to find genes involved in the osteogenic pathway of the mouse C2C12 cell line, studied here as a model for in vivo osteogenesis. The scheme takes as input a training set embodying expert biological knowledge and provides internal estimates of its own misclassification errors, which in turn enables systematic optimization of the classifier parameters. On the basis of the C2C12-generated expression data set, with 34,130 expression profiles across 2 time courses of 6 time points each, and a training set containing known members of the osteogenic, myoblastic, and adipocytic pathways, 176 new genes, in addition to 28 originally in the training set, are selected as relevant to osteogenesis. For this selection, the estimated sensitivity is 42% and the posterior false-positive rate (the fraction of candidates that are spurious) is 12%. The corresponding sensitivity and false-positive rate are 9% and 31%, respectively, for detection of myoblastic genes, and only 4% and ∼100%, respectively, for adipocytic genes, in accordance with an experimental design that predominantly stimulated the osteogenic pathway. Validation of this selection is provided by examining expression of the genes in an independent biological assay involving mouse calvaria (skull bone) primary cell cultures, in which a large fraction of the 176 genes are seen to be strongly regulated, as well as by case-by-case analysis of the genes on the basis of expert domain knowledge. The methodology should be generalizable to any situation in which enough a priori biological knowledge exists to define a training set. [Online supplementary material available at www.genome.org]
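
The general idea of a k-nearest-neighbor classifier that estimates its own errors can be illustrated with a minimal sketch. This is not the authors' implementation: the correlation-based distance, the value of k, the leave-one-out procedure, the function names, and the toy 6-point profiles are all assumptions made for illustration, and the noise-reduction filters mentioned in the abstract are omitted. The posterior false-positive rate is computed here over the labelled training set only, as the fraction of genes assigned to the target class that do not belong to it.

```python
from collections import Counter
from math import sqrt

def pearson_distance(a, b):
    """Return 1 - Pearson correlation between two expression profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 1.0  # flat profile: treat as uncorrelated
    return 1.0 - cov / sqrt(var_a * var_b)

def knn_predict(profile, training, k=3):
    """Majority vote among the k nearest labelled profiles.

    On a tied vote, the label of the closest tied neighbor wins,
    because Counter.most_common keeps first-seen order for equal counts.
    """
    nearest = sorted(training, key=lambda item: pearson_distance(profile, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def leave_one_out(training, target, k=3):
    """Estimate sensitivity and posterior false-positive rate for one class.

    Each labelled profile is held out in turn and classified using the rest.
    """
    tp = fp = fn = 0
    for i, (profile, label) in enumerate(training):
        rest = training[:i] + training[i + 1:]
        predicted = knn_predict(profile, rest, k)
        if predicted == target and label == target:
            tp += 1
        elif predicted == target:
            fp += 1
        elif label == target:
            fn += 1
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    false_positive_rate = fp / (tp + fp) if tp + fp else 0.0
    return sensitivity, false_positive_rate

# Hypothetical toy training set: (6-point time course, pathway label).
TRAINING = [
    ([0, 1, 2, 3, 4, 5], "osteogenic"),
    ([0, 1, 2, 3, 5, 6], "osteogenic"),
    ([1, 2, 3, 4, 5, 7], "osteogenic"),
    ([0, 2, 3, 4, 5, 6], "osteogenic"),
    ([5, 4, 3, 2, 1, 0], "myoblastic"),
    ([6, 5, 3, 2, 1, 0], "myoblastic"),
    ([7, 5, 4, 3, 2, 1], "myoblastic"),
    ([6, 5, 4, 3, 2, 0], "myoblastic"),
    ([0, 3, 5, 5, 3, 0], "adipocytic"),
    ([0, 2, 5, 6, 2, 0], "adipocytic"),
    ([1, 3, 6, 5, 3, 1], "adipocytic"),
    ([0, 3, 6, 6, 3, 1], "adipocytic"),
]

if __name__ == "__main__":
    print(knn_predict([0, 1, 3, 4, 5, 6], TRAINING))
    print(leave_one_out(TRAINING, "osteogenic"))
```

Because the error estimates come from the training set itself, the same loop could be rerun for different values of k or different distance measures and the setting with the best trade-off kept, which is the kind of parameter optimization the abstract refers to.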

