Kappa coefficients in medical research
- 19 June 2002
- Review article
- Published by Wiley in Statistics in Medicine
- Vol. 21 (14), 2109-2129
- https://doi.org/10.1002/sim.1180
Abstract
Kappa coefficients are measures of correlation between categorical variables often used as reliability or validity coefficients. We recapitulate the development and definitions of the K (categories) by M (ratings) kappas (K×M), discuss what they are well- or ill-designed to do, and summarize where kappas now stand with regard to their application in medical research. The 2×M (M ≥ 2) intraclass kappa seems the ideal measure of binary reliability; a 2×2 weighted kappa is an excellent choice, though not a unique one, as a validity measure. For both the intraclass and weighted kappas, we address continuing problems with kappas. There are serious problems with using the K×M intraclass (K > 2) or the various K×M weighted kappas for K > 2 or M > 2 in any context, either because they convey incomplete and possibly misleading information, or because other approaches are preferable to their use. We illustrate the use of the recommended kappas with applications in medical research.

Copyright © 2002 John Wiley & Sons, Ltd.
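As a minimal illustration of the simplest case discussed in the abstract, the 2×2 unweighted Cohen kappa can be computed from an agreement table as κ = (p_o − p_e) / (1 − p_e), where p_o is the observed proportion of agreement and p_e the agreement expected by chance from the marginal totals. This is a hedged sketch for orientation only, not code from the paper; the example table is invented.

```python
def cohens_kappa(table):
    """Unweighted Cohen's kappa for a square K x K agreement table.

    table[i][j] counts subjects rated category i by rater 1
    and category j by rater 2.
    """
    n = sum(sum(row) for row in table)          # total number of subjects
    k = len(table)
    # Observed agreement: proportion on the diagonal.
    p_o = sum(table[i][i] for i in range(k)) / n
    # Chance agreement: product of the two raters' marginal proportions.
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    p_e = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical 2x2 table: two raters classifying 100 subjects
# as positive/negative.
table = [[45, 5],
         [15, 35]]
print(cohens_kappa(table))  # 0.6
```

With these (made-up) counts, observed agreement is 0.80 while chance agreement from the marginals is 0.50, giving κ = 0.6, illustrating how kappa discounts agreement expected by chance.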