Assessment of Clinical Performance during Simulated Crises Using Both Technical and Behavioral Ratings

Abstract
Background: Techniques are needed to assess anesthesiologists' performance when responding to critical events. Patient simulators allow presentation of similar crisis situations to different clinicians. This study evaluated ratings of performance, and the interrater variability of the ratings, made by multiple independent observers viewing videotapes of simulated crises.

Methods: Raters scored the videotapes of 14 different teams that were managing two scenarios: malignant hyperthermia (MH) and cardiac arrest. Technical performance and crisis management behaviors were rated. Technical ratings could range from 0.0 to 1.0 based on scenario-specific checklists of appropriate actions. Ratings of 12 crisis management behaviors were made using a five-point ordinal scale. Several statistical assessments of interrater variability were applied.

Results: Technical ratings were high for most teams in both scenarios (0.78 ± 0.08 for MH, 0.83 ± 0.06 for cardiac arrest). Ratings of crisis management behavior varied, with some teams rated as minimally acceptable or poor (28% for MH, 14% for cardiac arrest). The agreement between raters was fair to excellent, depending on the item rated and the statistical test used.

Conclusions: Both technical and behavioral performance can be assessed from videotapes of simulations. The behavioral rating system can be improved; one particular difficulty was aggregating a single rating for a behavior that fluctuated over time. These performance assessment tools might be useful for educational research or for tracking a resident's progress. The rating system needs more refinement before it can be used to assess clinical competence for residency graduation or board certification.
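
The two kinds of measurement described in the Methods can be illustrated with a minimal sketch. It makes several assumptions that are not stated in the abstract: the technical rating is taken to be the unweighted fraction of checklist items performed, and interrater agreement is computed with unweighted Cohen's kappa for two raters, which is only one of many possible statistics (the abstract does not name the tests the authors used). The function names, checklist item names, and ratings are hypothetical placeholders, not study data.

```python
from collections import Counter


def checklist_score(actions_done, checklist):
    """Fraction of checklist items performed, in the range 0.0 to 1.0.

    Assumption: every item carries equal weight.
    """
    if not checklist:
        raise ValueError("checklist must not be empty")
    done = set(actions_done)
    return sum(1 for item in checklist if item in done) / len(checklist)


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters scoring the same items.

    Assumption: the five-point scale is treated as nominal categories.
    """
    if not ratings_a or len(ratings_a) != len(ratings_b):
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed proportion of items on which the two raters agree exactly.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical category throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical checklist and observed actions (placeholder names only).
checklist = ["item_1", "item_2", "item_3", "item_4"]
observed_actions = ["item_1", "item_3", "item_4"]
technical_rating = checklist_score(observed_actions, checklist)

# Hypothetical five-point behavioral ratings from two raters.
rater_1 = [5, 4, 3, 4, 2, 5]
rater_2 = [5, 4, 2, 4, 2, 4]
agreement = cohen_kappa(rater_1, rater_2)
```

Because the behavioral scale is ordinal, a weighted kappa or an intraclass correlation might reflect near-misses between raters better than the unweighted form shown here; the choice of statistic is one reason reported agreement can differ, as the Results note.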