Inter-observer agreement in the assessment of endoscopic findings in ulcerative colitis

Abstract
Endoscopic findings are essential in evaluating disease activity in ulcerative colitis. The aim of this study was to evaluate how endoscopists assess individual endoscopic features of mucosal inflammation in ulcerative colitis, the inter-observer agreement, and the importance of the observers' experience. Five video clips of ulcerative colitis were shown to a group of experienced endoscopists and a group of inexperienced endoscopists. Both groups were asked to assess eight endoscopic features and the overall mucosal inflammation on a visual analogue scale. The following statistical analyses were used: contingency table analysis, kappa analysis, analysis of variance, Pearson linear correlation analysis, general linear models, and agreement analysis. All tests were two-tailed, with a significance level of 5%. The inter-observer agreement ranged from very good to moderate in the experienced group and from very good to fair in the inexperienced group. Inter-observer agreement was significantly better in the experienced group in the rating of 6 out of 9 features (p < 0.05). The experienced and inexperienced endoscopists scored "ulcerations" significantly differently (p = 0.05). The inter-observer variation in the mean score of "erosions", "ulcerations" and the endoscopic activity index (EAI) in mild disease, and in the scoring of "erythema" and "oedema" in moderate-severe disease, was significantly higher in the inexperienced group. A correlation was seen between all the observed endoscopic features in both groups of endoscopists. Among experienced endoscopists, a set of four endoscopic variables ("Vascular pattern", "Erosions", "Ulcerations" and "Friability") explained 92% of the variation in EAI. When "Granularity" was added to this set, 91% of the variation in EAI was explained in the group of inexperienced endoscopists.
The inter-observer agreement in the rating of endoscopic features characterising ulcerative colitis is satisfactory in both groups of endoscopists but significantly higher in the experienced group. The difference in the mean score between the two groups is significant only for "ulcerations". The endoscopic variables "Vascular pattern", "Erosions", "Ulcerations" and "Friability" explained most of the variation in the overall endoscopic activity index. Although the present result is quite satisfactory, there is potential for improvement. Improved grading systems might help improve the consistency of endoscopic descriptions.
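
For readers unfamiliar with the kappa analysis named above, the following is a minimal sketch of Cohen's kappa for two observers. It is an illustration only: the abstract does not state which kappa variant or interpretation bands were used, or how the visual analogue scores were categorised. The function name `cohen_kappa`, the three-level grading and the example grades are all hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical grades to the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: fraction of items both raters graded identically.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: derived from each rater's marginal grade frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(freq_a[g] * freq_b[g] for g in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both raters used one identical grade for every item
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical grades (0 = normal, 1 = mild, 2 = severe) for five video clips.
# These values are invented for illustration, not taken from the study.
grades_a = [0, 1, 2, 1, 2]
grades_b = [0, 1, 2, 2, 2]
kappa = cohen_kappa(grades_a, grades_b)
print(f"kappa = {kappa:.4f}")
```

Kappa corrects the raw proportion of agreement for the agreement expected by chance, so it is lower than the raw proportion whenever the two observers disagree on any item. With more than two observers, as in the groups described above, a multi-rater variant or pairwise averaging would be needed; which approach the authors took is not stated here.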