Sensitivity of the modified Angoff standard‐setting method to variations in item content

Abstract
This study examined the sensitivity of the modified Angoff standard‐setting method to differences in item content when examinee performance data were provided for the judges. Three medical certifying examinations contained subsets of “core” items. Compared to the remainder of the examinations, the core subsets were written to test points that were relatively more critical for patient outcome, more frequently encountered, and/or for which there was some evidence that intervention would have an effect on patient outcome. Results comparing standardized distances showed that judges set tougher standards for the core items than for the noncore items and that the differences between performance data and standards varied for core and noncore items and for item types. It was concluded that content and judgment remain important features of the standard‐setting process, and judges are not simply adding or subtracting a constant to the performance data they are given.