Beyond kappa: A review of interrater agreement measures
- 1 March 1999
- journal article
- review article
- Published by Wiley in The Canadian Journal of Statistics / La Revue Canadienne de Statistique
- Vol. 27 (1), 3-23
- https://doi.org/10.2307/3315487
Abstract
In 1960, Cohen introduced the kappa coefficient to measure chance‐corrected nominal scale agreement between two raters. Since then, numerous extensions and generalizations of this interrater agreement measure have been proposed in the literature. This paper reviews and critiques various approaches to the study of interrater agreement, for which the relevant data comprise either nominal or ordinal categorical ratings from multiple raters. It presents a comprehensive compilation of the main statistical approaches to this problem, descriptions and characterizations of the underlying models, and discussions of related statistical methodologies for estimation and confidence‐interval construction. The emphasis is on various practical scenarios and designs that underlie the development of these measures, and the interrelationships between them.
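For readers unfamiliar with the coefficient the abstract starts from, the following is a minimal sketch of Cohen's kappa for two raters on a nominal scale, using the standard definition kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion of agreement and p_e is the agreement expected by chance from the two raters' marginal frequencies. The function name `cohen_kappa` and the example ratings are illustrative assumptions and are not taken from the paper.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters on a nominal scale (illustrative sketch)."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("both raters must rate the same non-empty set of subjects")
    n = len(ratings_a)

    # Observed agreement: share of subjects given the same category by both raters.
    p_o = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: sum over categories of the product of marginal proportions.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical ratings of four subjects by two raters.
rater_1 = ["yes", "yes", "no", "no"]
rater_2 = ["yes", "no", "no", "no"]
kappa = cohen_kappa(rater_1, rater_2)  # p_o = 0.75, p_e = 0.5, so kappa = 0.5
```

This covers only the original two-rater, unweighted case; the weighted, multi-rater, and model-based extensions that the review surveys are not sketched here.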
This publication has 66 references indexed in Scilit:
- Maximum likelihood estimation of the kappa coefficient from models of matched binary responses. Statistics in Medicine, 1995
- Latent class analysis of diagnostic agreement. Statistics in Medicine, 1990
- Misinterpretation and misuse of the kappa statistic. American Journal of Epidemiology, 1987
- Longitudinal data analysis using generalized linear models. Biometrika, 1986
- Modeling Agreement among Raters. Journal of the American Statistical Association, 1985
- Simple Models for the Analysis of Association in Cross-Classifications Having Ordered Categories. Journal of the American Statistical Association, 1979
- The Equivalence of Weighted Kappa and the Intraclass Correlation Coefficient as Measures of Reliability. Educational and Psychological Measurement, 1973
- Measures of response agreement for qualitative data: Some generalizations and alternatives. Psychological Bulletin, 1971
- A Coefficient of Agreement for Nominal Scales. Educational and Psychological Measurement, 1960