Abstract
New diagnostic tests are frequently evaluated against gold standards that are assumed to classify patients with unerring accuracy according to the presence or absence of disease. In practice, gold standards are rarely perfect predictors of disease and tend to misclassify a small number of patients. When an imperfect standard is used to evaluate a diagnostic test, many commonly used measures of test performance are distorted. It is not widely appreciated that these distortions occur in predictable directions and that they may be of considerable magnitude, even when the gold standard is highly accurate. The diagnostic power of clinical tests will be reported more accurately if consideration is given to the types of bias that result from the use of imperfect standards. Several approaches may be used to minimize these distortions when evaluating new tests.
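
The size of the distortion can be illustrated with a short calculation. The sketch below is not taken from the paper. It assumes that the errors of the new test and of the reference standard are conditionally independent given true disease status, which the abstract does not state, and the function name, parameter names, and input values are all hypothetical. Under that assumption it computes the sensitivity and specificity a new test would appear to have when scored against an imperfect standard.

```python
def apparent_accuracy(prevalence, test_sens, test_spec, ref_sens, ref_spec):
    """Return (apparent sensitivity, apparent specificity) of a new test
    when it is scored against an imperfect reference standard.

    Assumption (illustrative, not from the source): the errors of the test
    and of the reference are conditionally independent given true status.
    """
    p = prevalence

    # Joint probabilities of the test result and the reference result,
    # summed over diseased and non-diseased patients.
    both_pos = p * test_sens * ref_sens + (1 - p) * (1 - test_spec) * (1 - ref_spec)
    both_neg = p * (1 - test_sens) * (1 - ref_sens) + (1 - p) * test_spec * ref_spec

    # Marginal probabilities of the reference result.
    ref_pos = p * ref_sens + (1 - p) * (1 - ref_spec)
    ref_neg = 1 - ref_pos

    return both_pos / ref_pos, both_neg / ref_neg


# Hypothetical inputs: a test with true sensitivity and specificity of 0.95,
# a reference standard that is 98% sensitive and 98% specific,
# and a disease prevalence of 5%.
app_sens, app_spec = apparent_accuracy(0.05, 0.95, 0.95, 0.98, 0.98)
print(f"apparent sensitivity: {app_sens:.3f}")
print(f"apparent specificity: {app_spec:.3f}")
```

With these hypothetical inputs the apparent sensitivity is about 0.70 against a true value of 0.95, while the apparent specificity is about 0.949, almost unchanged. Under the independence assumption both measures are biased downward, and at low prevalence the effect falls mainly on sensitivity. If the errors of the test and the standard were correlated, the direction of the bias could differ.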