An Application of Hierarchical Kappa-type Statistics in the Assessment of Majority Agreement among Multiple Observers

Abstract
This paper presents a general statistical methodology for the analysis of multivariate categorical data involving agreement among more than two observers. Since these situations give rise to very large contingency tables in which most of the observed cell frequencies are zero, procedures based on indicator variables of the raw data for individual subjects are used to generate first-order margins and main diagonal sums from the conceptual multidimensional contingency table. From these quantities, estimates are generated to reflect the strength of an internal majority decision on each subject. Moreover, a subset of observers who demonstrate a high level of interobserver agreement can be identified by using pairwise agreement statistics between each observer and the internal majority standard opinion on each subject. These procedures are all illustrated within the context of a clinical diagnosis example involving seven pathologists.
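As a rough illustration of the final step described above, the following Python sketch compares each observer with an internal majority standard. It is a simplification under stated assumptions, not the hierarchical kappa-type procedure of the paper: it uses ordinary Cohen's kappa, computed from the first-order margins and the main-diagonal sum of each two-way table, takes the majority opinion to be the modal category on each subject, and drops subjects with a tied mode. The function names and the data layout (one row per subject, one column per observer) are hypothetical.

```python
from collections import Counter


def majority_category(row):
    """Return (modal category, share of observers choosing it).

    The category is None when the mode is tied, so the caller can
    exclude that subject (an assumption of this sketch).
    """
    counts = Counter(row).most_common()
    top_cat, top_n = counts[0]
    strength = top_n / len(row)
    if len(counts) > 1 and counts[1][1] == top_n:
        return None, strength
    return top_cat, strength


def cohen_kappa(a, b):
    """Ordinary Cohen's kappa for two equal-length lists of categories."""
    n = len(a)
    # Observed agreement: main-diagonal sum of the two-way table, as a proportion.
    p_obs = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: products of the first-order margins.
    marg_a, marg_b = Counter(a), Counter(b)
    p_exp = sum(marg_a[c] * marg_b[c] for c in set(a) | set(b)) / (n * n)
    if p_exp == 1.0:
        return float("nan")  # kappa is undefined when only one category occurs
    return (p_obs - p_exp) / (1 - p_exp)


def kappa_vs_majority(ratings):
    """Kappa between each observer and the internal majority standard.

    ratings: list of rows, one per subject, each row holding one category
    per observer.
    """
    majorities = [majority_category(row)[0] for row in ratings]
    keep = [i for i, m in enumerate(majorities) if m is not None]
    standard = [majorities[i] for i in keep]
    n_observers = len(ratings[0])
    return [
        cohen_kappa([ratings[i][j] for i in keep], standard)
        for j in range(n_observers)
    ]
```

Because each observer also contributes to the majority standard in this sketch, the resulting kappas are biased upward, more so when the number of observers is small. Observers whose kappa against the standard is high would be candidates for the high-agreement subset.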