Comparative accuracy: assessing new tests against existing diagnostic pathways

Abstract
Replacement
New tests may differ from existing ones in various ways (table 1). They may be more accurate, less invasive, easier to do, less risky, less uncomfortable for patients, quicker to yield results, technically less challenging, or more easily interpreted.
Table 1 Some features of three sets of diagnostic tests
For example, biomarkers for prostate cancer have recently been proposed as a more accurate replacement for prostate specific antigen. A rapid blood test that detects individual activated effector T cells (SPOT-TB) has been introduced as a better way to diagnose tuberculosis than the tuberculin skin test. Myelography has been replaced in most centres by magnetic resonance imaging to detect spinal cord injuries, not only because magnetic resonance imaging provides detailed images, but also because it is simpler, safer, and does not require exposure to radiation (table 2).
Table 2 Examples of proposed replacement, triage, and add-on diagnostic tests
Study designs
To find out whether a new test can replace an existing one, the diagnostic accuracy of both tests has to be compared. As the sensitivity and specificity of a test can vary across subgroups, the tests must be evaluated in comparable groups or, preferably, in the same patients.3
Studies of comparative accuracy compare the new test with existing tests and verify test results against the same reference standard. One possibility is a paired study, in which a set of patients is tested with the existing test, the new test, and the reference standard. Another option is a randomised controlled trial, in which patients are randomly allocated to have either the existing test or the new test, after which all patients are assessed with the reference standard.
A paired study design has several advantages over a randomised trial: the patients evaluated by both tests are fully comparable, and fewer patients may be needed.
Randomised trials are preferred if the tests are too invasive for the old and new tests to be done in the same patients, if the tests interfere with each other, or if the study has other objectives, such as assessing adverse events, the participation of patients in testing, the actions of practitioners, or patient outcomes. Randomised controlled trials are currently being used to compare, for example, point of care cardiac markers with routine testing for the evaluation of acute coronary syndrome.
Full verification of all test results in a paired study is not always necessary to find out whether a test can act as a replacement. For example, one study compared testing for human papillomavirus DNA in self collected vaginal swabs with Papanicolaou smears to detect cervical disease and performed colposcopy (the reference standard) in all patients who tested positive on one or both of these tests.4 Because patients who tested negative on both tests were not verified, the sensitivity and specificity of the two tests could not be calculated, but the relative true and false positive rates could still be estimated, which allowed the accuracy of the two tests to be compared against the reference standard.5 6 7
Triage
In triage, the new test is used before the existing test or testing pathway, and only patients with a particular result on the triage test continue the testing pathway (figure). Triage tests may be less accurate than existing ones and may not be meant to replace them. They have other advantages, such as simplicity or low cost.
Figure Roles of tests and positions in existing diagnostic pathways
An example of a triage instrument is the set of Ottawa ankle rules, a simple decision aid for use when ankle fractures are suspected.8 Patients who test negative on the ankle rules (the triage test) do not need radiography (the existing test), as a negative result makes a fracture of the malleolus or the midfoot unlikely.
Another example is plasma D-dimer in the diagnosis of suspected pulmonary embolism. Patients with a low clinical probability of pulmonary embolism and a negative D-dimer result may not need computed tomography, as pulmonary embolism can be ruled out (table 2).9
Study designs
The triage test does not aim to improve the diagnostic accuracy of the current pathway. Rather, it reduces the use of existing tests that are more invasive, cumbersome, or expensive. Several designs can be used to compare the accuracy of the triage strategy with that of the existing test. In a fully paired study design, all patients undergo the triage test, the existing test, and the reference standard.
Designs with limited verification can be used here as well, as the primary concern is to find out whether disease will be missed with the triage test and how efficient the triage test is. One option is to use a paired design and verify the results only of patients who test negative on the triage test but positive on the existing test. This will identify patients in whom disease will be missed if the triage test is used, as well as patients in whom the existing test can be avoided.
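The arithmetic behind such a limited verification design can be illustrated with a short sketch. It is not taken from any of the cited studies: the class name, field names, and counts below are hypothetical, and it assumes that every patient has both the triage test and the existing test, and that only the discordant group (triage negative, existing test positive) is verified against the reference standard.

```python
from dataclasses import dataclass


@dataclass
class TriageStudyCounts:
    """Hypothetical counts from a paired triage study with limited verification."""
    triage_pos: int               # triage positive: would continue to the existing test
    triage_neg_existing_neg: int  # negative on both tests: not verified
    triage_neg_existing_pos: int  # discordant group: all verified by the reference standard
    discordant_diseased: int      # discordant patients with disease on the reference standard


def summarise_triage(c: TriageStudyCounts) -> dict:
    """Summarise missed disease and efficiency of a triage strategy (illustrative only)."""
    total = c.triage_pos + c.triage_neg_existing_neg + c.triage_neg_existing_pos
    if total <= 0:
        raise ValueError("study must include at least one patient")
    if not 0 <= c.discordant_diseased <= c.triage_neg_existing_pos:
        raise ValueError("diseased discordant patients cannot exceed the discordant group")

    # Under triage, every triage-negative patient would skip the existing test.
    triage_neg = c.triage_neg_existing_neg + c.triage_neg_existing_pos

    return {
        # Disease the existing test would have picked up but the triage strategy misses.
        "missed_cases": c.discordant_diseased,
        "missed_per_1000_tested": 1000 * c.discordant_diseased / total,
        # Efficiency: how many existing tests the triage test makes unnecessary.
        "existing_tests_avoided": triage_neg,
        "proportion_avoided": triage_neg / total,
        # Discordant patients without disease are existing-test false positives
        # that the triage strategy no longer generates.
        "false_positives_avoided": c.triage_neg_existing_pos - c.discordant_diseased,
    }


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    counts = TriageStudyCounts(
        triage_pos=400,
        triage_neg_existing_neg=570,
        triage_neg_existing_pos=30,
        discordant_diseased=3,
    )
    for name, value in summarise_triage(counts).items():
        print(f"{name}: {value}")
```

Patients who are negative on both tests are left unverified in this sketch. Any disease among them would be missed by the existing test as well, so it does not affect the comparison between the triage strategy and the existing test, although it does mean that the sensitivity of either strategy cannot be estimated from these data.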