Coefficient Kappa: Some Uses, Misuses, and Alternatives
- 1 October 1981
- Journal article (research article)
- Published by SAGE Publications in Educational and Psychological Measurement
- Vol. 41 (3), 687-699
- https://doi.org/10.1177/001316448104100307
Abstract
This paper considers some appropriate and inappropriate uses of coefficient kappa and alternative kappa-like statistics. Discussion is restricted to the descriptive characteristics of these statistics for measuring agreement with categorical data in studies of reliability and validity. Special consideration is given to assumptions about whether marginals are fixed a priori, or free to vary. In reliability studies, when marginals are fixed, coefficient kappa is found to be appropriate. When either or both of the marginals are free to vary, however, it is suggested that the "chance" term in kappa be replaced by 1/n, where n is the number of categories. In validity studies, we suggest considering whether one wants an index of improvement beyond "chance" or beyond the best a priori strategy employing base rates. In the former case, considerations are similar to those in reliability studies with the marginals for the criterion measure considered as fixed. In the latter case, it is suggested that the largest marginal proportion for the criterion measure be used in place of the "chance" term in kappa. Similarities and differences among these statistics are discussed and illustrated with synthetic data.
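The following is a minimal sketch (not from the paper) of the three indices the abstract describes, assuming a square contingency table of joint classification counts: Cohen's kappa with the usual marginal-product "chance" term, the variant that substitutes 1/n when marginals are free to vary, and the validity-study variant that substitutes the largest criterion marginal. Function and variable names such as agreement_indices, kappa_1_over_n, and kappa_base_rate are illustrative, not the paper's notation.

```python
import numpy as np

def agreement_indices(table):
    """Compute kappa-like agreement indices for a square contingency table.

    table[i][j] is the count of items placed in category i by the first
    classification (e.g., the criterion) and in category j by the second.
    """
    t = np.asarray(table, dtype=float)
    total = t.sum()
    p_obs = np.trace(t) / total            # observed proportion of agreement
    row_marg = t.sum(axis=1) / total       # marginals of the first classification
    col_marg = t.sum(axis=0) / total       # marginals of the second classification
    n_categories = t.shape[0]

    # Cohen's kappa: "chance" agreement from the product of the two marginals.
    p_chance = float(np.dot(row_marg, col_marg))
    kappa = (p_obs - p_chance) / (1.0 - p_chance)

    # Variant suggested when marginals are free to vary: replace the
    # "chance" term with 1/n, where n is the number of categories.
    kappa_1_over_n = (p_obs - 1.0 / n_categories) / (1.0 - 1.0 / n_categories)

    # Validity-study variant: improvement beyond the best a priori strategy,
    # i.e., always predicting the largest criterion (row) marginal category.
    p_max = float(row_marg.max())
    kappa_base_rate = (p_obs - p_max) / (1.0 - p_max)

    return {"kappa": kappa,
            "kappa_1_over_n": kappa_1_over_n,
            "kappa_base_rate": kappa_base_rate}

# Synthetic 3x3 table of joint classification counts.
counts = [[40, 5, 5],
          [6, 25, 4],
          [4, 6, 15]]
print(agreement_indices(counts))
```

As the abstract notes, the three denominators differ only in which baseline is subtracted, so the indices can diverge sharply when the marginals are highly unbalanced.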