Statistical methods for the assessment of prognostic biomarkers (Part I): Discrimination

Abstract
The objective of every diagnostic and therapeutic decision a physician makes in the care of patients is to improve prognosis by preventing or modifying the natural evolution of the disease. Prognostic research is therefore an important area of investigation in clinical epidemiology. A clinician usually formulates a prognosis on the basis of patient characteristics or biomarkers, i.e. clinical indicators of various sorts (for example serum creatinine, arterial pressure, or left ventricular mass as measured by echocardiography) that reflect normal or pathological pathways related to the exposure–outcome axis or to biological responses to a therapeutic intervention. Although biomarkers can be classified into different categories, here we consider only biomarkers of exposure. An ideal prognostic biomarker should allow the early identification of individuals at risk for a given outcome and should be relatively easy to measure at acceptable cost. Before being introduced into clinical practice, biomarkers need to be properly validated. A fundamental step in this validation process is formal proof that the biomarker predicts prognosis, or that modifying it affects prognosis, at an individual or population level. Multivariate modelling of pertinent clinical outcomes, combining multiple biomarkers and other indicators, allows the generation of risk prediction scores. The most popular of these is the Framingham risk score (FRS), which is based on demographic information (age, sex), an environmental exposure (smoking) and three biomarkers (systolic pressure, total cholesterol and high-density lipoprotein (HDL) cholesterol). The FRS estimates an individual's 10-year risk of myocardial infarction and coronary death. Accuracy and generalizability are important issues in the use of both prognostic biomarkers and risk prediction scores.
Accuracy is the agreement between the outcome predicted by the biomarker and the actual occurrence of the outcome. Generalizability is the capacity of the same biomarker to provide accurate predictions in population samples other than the one in which it was originally validated. Three methods are commonly used to assess the accuracy of biomarkers for predicting clinical outcomes: discrimination, calibration and reclassification. This article focuses on discrimination; the next one covers calibration and reclassification.
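As a concrete illustration of discrimination, one widely used summary for a binary outcome is the concordance (C) statistic: the proportion of all pairs formed by one patient with the event and one without it in which the patient with the event has the higher biomarker value. The sketch below is not taken from the article. The function name `c_statistic` and the small data set are hypothetical, ties are assumed to count as one half, and censoring or follow-up time is ignored.

```python
from itertools import product


def c_statistic(scores, outcomes):
    """Concordance (C) statistic for a binary outcome.

    scores   -- biomarker values or predicted risks, one per patient
    outcomes -- 1 if the patient had the event, 0 otherwise
    Ties between an event and a non-event count as 0.5 (an assumption).
    """
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(scores, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    # Compare every patient with the event against every patient without it.
    for e, n in product(events, non_events):
        if e > n:
            concordant += 1.0
        elif e == n:
            concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Hypothetical biomarker values and outcomes for six patients.
biomarker = [1.2, 0.8, 2.5, 1.9, 0.7, 3.1]
outcome = [0, 0, 1, 1, 0, 0]

print(c_statistic(biomarker, outcome))  # 6 of 8 pairs are concordant -> 0.75
```

A value of 0.5 corresponds to discrimination no better than chance and 1.0 to perfect separation of patients with and without the event. For a binary outcome this quantity equals the area under the receiver operating characteristic curve.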