Performance of “standardized examinees” in a standardized-patient examination of clinical skills

Abstract
PURPOSE: To test whether medical residents could simulate students in a standardized-patient (SP) examination and perform consistently at specified levels under test conditions, as a first step in evaluating trained “standardized examinees” (SEs) as a quality-assurance measure for the SP scoring process.
METHOD: In 1994-95, fourth-year students from the Baltimore-Washington Consortium for SPs participated in a National Board of Medical Examiners Prototype Examination of clinical skills consisting of twelve 15-minute student-patient encounters. For this examination, internal medicine residents were trained to act as ordinary candidates and to achieve target scores by performing to a set level on the specific checklist items that SPs used to record interviewing, physical-examination, and communication skills. The “strong SEs” were trained to score 80% correct on six of the examination's 12 cases (the study cases), and the “weak SEs” were trained to score 40% correct on the same six cases. The strong and weak SEs' checklist scores on the study cases were compared using independent-samples, two-tailed t-tests. When agreement between the SE's trained performance and the SP's recording on the checklist items for a case fell below 85%, the videotape of that encounter was reviewed, and the score agreed upon after review was taken as the SE's performance.
RESULTS: Seven SEs took the SP examination without being detected by the SPs. There were 84 discrepancies between predicted and recorded checklist scores across 659 checklist items in 40 encounters scored by the SPs. After the discrepancies were corrected based on videotape review, the estimated actual mean score was 77.3% for the strong SEs and 44.0% for the weak SEs, and it was higher for the strong SEs in each study case. The overall fidelity of the SEs to their training was estimated to be 97%, and the overall SP accuracy was estimated to be 91%. The videotape review revealed 47 training-scoring discrepancies, most in the area of communication skills.
CONCLUSION: This study suggests that SEs can be trained to specific performance levels and may be an effective internal control for a high-stakes SP examination. They may also provide a mechanism for refining scoring checklists and for exploring the validity of SP examinations.
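The 85% agreement rule described in the METHOD can be illustrated with a minimal Python sketch. It assumes each checklist item is recorded as done or not done and that agreement is the simple proportion of items on which the SE's trained behavior and the SP's recording match; the abstract does not give the exact computation, so the function names and the example data below are hypothetical.

```python
from typing import Sequence

# Threshold taken from the abstract: review the videotape when
# agreement on a case's checklist items is below 85%.
REVIEW_THRESHOLD = 0.85


def agreement_rate(trained: Sequence[bool], recorded: Sequence[bool]) -> float:
    """Proportion of checklist items where the SE's trained behavior
    (done / not done) matches what the SP recorded.

    Assumption: simple item-by-item percent agreement.
    """
    if len(trained) != len(recorded):
        raise ValueError("checklists must have the same number of items")
    if not trained:
        raise ValueError("checklist must contain at least one item")
    matches = sum(t == r for t, r in zip(trained, recorded))
    return matches / len(trained)


def needs_video_review(
    trained: Sequence[bool],
    recorded: Sequence[bool],
    threshold: float = REVIEW_THRESHOLD,
) -> bool:
    """True when agreement falls below the review threshold."""
    return agreement_rate(trained, recorded) < threshold


# Hypothetical 20-item case for a "strong" SE trained to perform 16 items (80%).
trained = [True] * 16 + [False] * 4

# SP recording that disagrees with the training on 4 items (16 of 20 agree).
recorded_a = [not t for t in trained[:4]] + trained[4:]

# SP recording that disagrees on 2 items (18 of 20 agree).
recorded_b = [not t for t in trained[:2]] + trained[2:]

review_a = needs_video_review(trained, recorded_a)
review_b = needs_video_review(trained, recorded_b)
```

In this made-up example the first recording falls below the threshold and would be sent for videotape review, while the second would not.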