A Paradigm for Class Prediction Using Gene Expression Profiles
- 1 June 2002
- journal article
- research article
- Published by Mary Ann Liebert Inc in Journal of Computational Biology
- Vol. 9 (3), 505-511
- https://doi.org/10.1089/106652702760138592
Abstract
We propose a general framework for prediction of predefined tumor classes using gene expression profiles from microarray experiments. The framework consists of 1) evaluating the appropriateness of class prediction for the given data set, 2) selecting the prediction method, 3) performing cross-validated class prediction, and 4) assessing the significance of prediction results by permutation testing. We describe an application of the prediction paradigm to gene expression profiles from human breast cancers, with specimens classified as positive or negative for BRCA1 mutations and also for BRCA2 mutations. In both cases, the accuracy of class prediction was statistically significant when compared to the accuracy of prediction expected by chance. The framework proposed here for the application of class prediction is designed to reduce the occurrence of spurious findings, a legitimate concern for high-dimensional microarray data. The prediction paradigm will serve as a good framework for comparing different prediction methods and may accelerate the development of molecular classifiers that are clinically useful.
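Steps 3 and 4 of the framework can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the method of the article: the data are synthetic, the classifier is a simple nearest-centroid rule standing in for whichever prediction method is chosen in step 2, and the function names, the `n_genes` parameter, and the add-one p-value correction are hypothetical choices. The one point the sketch tries to make concrete is that gene selection is repeated inside every cross-validation fold, and again for every permutation of the class labels, so the held-out specimen never influences the classifier that predicts it.

```python
import numpy as np

def nearest_centroid_predict(X_train, y_train, X_test, n_genes=10):
    """Select genes on the training set only, then classify by nearest centroid."""
    a, b = X_train[y_train == 0], X_train[y_train == 1]
    # t-like score per gene, computed without the held-out specimen
    score = np.abs(a.mean(0) - b.mean(0)) / (a.std(0) + b.std(0) + 1e-9)
    top = np.argsort(score)[-n_genes:]
    c0, c1 = a[:, top].mean(0), b[:, top].mean(0)
    d0 = ((X_test[:, top] - c0) ** 2).sum(1)
    d1 = ((X_test[:, top] - c1) ** 2).sum(1)
    return (d1 < d0).astype(int)

def loocv_errors(X, y, n_genes=10):
    """Count leave-one-out misclassifications (step 3)."""
    n = len(y)
    errors = 0
    for i in range(n):
        mask = np.arange(n) != i
        pred = nearest_centroid_predict(X[mask], y[mask], X[i:i + 1], n_genes)
        errors += int(pred[0] != y[i])
    return errors

def permutation_p_value(X, y, n_perm=100, n_genes=10, seed=0):
    """Compare the observed error count with counts under shuffled labels (step 4)."""
    rng = np.random.default_rng(seed)
    observed = loocv_errors(X, y, n_genes)
    as_good = 0
    for _ in range(n_perm):
        # The whole cross-validation, gene selection included, is rerun per permutation
        if loocv_errors(X, rng.permutation(y), n_genes) <= observed:
            as_good += 1
    # Add-one correction keeps the estimate strictly above zero
    return observed, (as_good + 1) / (n_perm + 1)

# Synthetic example: 20 specimens, 200 genes, 10 of which differ between classes
rng = np.random.default_rng(42)
y = np.array([0] * 10 + [1] * 10)
X = rng.normal(size=(20, 200))
X[y == 1, :10] += 3.0

observed, p_value = permutation_p_value(X, y)
print(f"LOOCV errors: {observed} of {len(y)}, permutation p-value: {p_value:.3f}")
```

Selecting genes once on the full data set and cross-validating only the final classification step would make the error count optimistic. Keeping selection inside the loop is what lets the permutation distribution serve as a fair reference for the accuracy expected by chance.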