Gene expression–based survival prediction in lung adenocarcinoma: a multi-site, blinded validation study

Abstract
Studies of gene expression in lung cancer have the potential to affect patient care, but the general applicability of the derived classifiers is unclear. David Beer and his colleagues now analyze more than 400 lung tumors from subjects at six institutions using eight different classifiers and show that the combination of molecular and clinical data best predicts patient survival (pages 812–813).

Prognostic gene expression signatures for survival in early-stage lung cancer have been proposed, but clinical application requires establishing their performance across different subject populations and in different laboratories. Here we report a large, training–testing, multi-site, blinded validation study characterizing the performance of several gene expression–based prognostic models on 442 lung adenocarcinomas. We tested whether microarray measurements of gene expression, either alone or combined with basic clinical covariates (stage, age, sex), could be used to predict overall survival in lung cancer subjects. Several of the models produced risk scores that correlated substantially with actual subject outcome. Most methods performed better when clinical data were included, supporting the combined use of clinical and molecular information when building prognostic models for early-stage lung cancer. This study also provides the largest available set of microarray data with extensive pathological and clinical annotation for lung adenocarcinomas.
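To make concrete what it means for a risk score to correlate with subject outcome, the following is a minimal sketch of Harrell's concordance index for right-censored survival data. The abstract does not say which agreement measure was used, so the choice of this metric is an assumption made here for illustration. The function name `concordance_index` and the toy values are hypothetical and are not study data.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C: among comparable pairs, the fraction in which the
    subject with the higher risk score has the shorter survival time.

    times       -- follow-up time for each subject
    events      -- 1/True if death was observed, 0/False if censored
    risk_scores -- model output; higher means worse predicted prognosis
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the subject with the shorter follow-up time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # Simplification (assumption): pairs with tied times are skipped.
        # A pair is comparable only if the shorter time is an observed death.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied scores count as half-concordant
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy illustration only (made-up values, not data from the study).
toy_times = [5, 10, 15, 20]            # months of follow-up
toy_events = [1, 1, 0, 1]              # third subject is censored
toy_risk = [0.9, 0.1, 0.4, 0.7]        # hypothetical model risk scores
toy_c = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking. Under this measure, comparing a gene-expression-only score with one that also uses stage, age, and sex would mean computing the index for each score on the same held-out subjects.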