Abstract
Evaluation of histological diagnosis requires an index of agreement (to measure repeatability and validity) together with a method of assessing bias. Cohen's kappa statistic appears to be the most suitable tool for measuring levels of agreement; where agreement is unsatisfactory, bias may be the cause. Bias can be studied further by examining levels of agreement for each diagnostic category or by searching for categories of disagreement in which more observations occur than would be expected by chance alone. This article gives reasons for choosing the kappa statistic, with examples illustrating its calculation and the investigation of bias.
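The calculations named above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the article's own worked example: the 3 x 3 table of counts is invented, the function names (`cohen_kappa`, `category_kappa`, `expected_counts`) are hypothetical, and the chance-expected counts use the simple independence model (row total times column total divided by the grand total), which may differ from the procedure used in the article.

```python
def _totals(table):
    """Return (grand total, row totals, column totals) of a square table."""
    k = len(table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    return sum(row_tot), row_tot, col_tot


def cohen_kappa(table):
    """Cohen's kappa for a square table of counts.

    table[i][j] = number of cases placed in category i by the first
    observer and in category j by the second observer.
    """
    n, row_tot, col_tot = _totals(table)
    k = len(table)
    p_o = sum(table[i][i] for i in range(k)) / n  # observed agreement
    p_e = sum(row_tot[i] * col_tot[i] for i in range(k)) / n ** 2  # chance agreement
    return (p_o - p_e) / (1 - p_e)


def category_kappa(table, c):
    """Kappa for category c alone: collapse the table to 'c' versus 'not c'."""
    n, row_tot, col_tot = _totals(table)
    both = table[c][c]
    first_only = row_tot[c] - both
    second_only = col_tot[c] - both
    neither = n - both - first_only - second_only
    return cohen_kappa([[both, first_only], [second_only, neither]])


def expected_counts(table):
    """Chance-expected count for every cell, assuming independent observers."""
    n, row_tot, col_tot = _totals(table)
    return [[r * c / n for c in col_tot] for r in row_tot]


# Invented counts for two observers and three diagnostic categories
# (illustrative assumption only, not data from the article).
table = [
    [20, 5, 0],
    [3, 10, 2],
    [1, 4, 15],
]

print(round(cohen_kappa(table), 3))  # 0.621 for this invented table
for c in range(len(table)):
    print(c, category_kappa(table, c))

# Compare observed and chance-expected counts in the disagreement cells.
expected = expected_counts(table)
for i in range(len(table)):
    for j in range(len(table)):
        if i != j:
            print((i, j), table[i][j], expected[i][j])
```

A low kappa for one category, or a disagreement cell holding noticeably more observations than its chance-expected count, would point to where bias between the observers might be sought.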