Classification of thoracic and lumbar spine fractures: problems of reproducibility

Abstract
The reproducibility of fracture classification systems in general has been a matter of controversy, and that of spinal fracture classifications has not been sufficiently studied. We studied the inter-observer and intra-observer reproducibility of the Magerl (AO) classification using radiographs, CT scans and MRI scans of 53 patients, and compared it with the older and simpler Denis classification. Five observers classified the fractures, first using radiographs and CT and, 6 weeks later, using radiographs and MRI. Three of the observers repeated the readings after 3 months. Three observers also classified the fractures according to Denis. Agreement was measured using Cohen's κ statistic. At the type level (A, B, C) of the AO system, agreement was only fair with CT and moderate with MRI. Subclassification of the types into groups yielded higher κ values, corresponding to substantial agreement. Agreement was, in general, better with the Denis classification, but the variance was higher because some injury patterns were difficult to assign to an appropriate category. Although the AO classification allows all kinds of injury to be recorded properly, its reproducibility, especially at the type level, is problematic. Use of MRI and a better definition of the distinguishing features of the three types may enhance the reproducibility of the scheme.
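
For readers unfamiliar with the agreement measure, the following is a minimal sketch of how Cohen's κ can be computed for two observers who assign each fracture to one of the AO types A, B or C. The ratings are invented for illustration and are not data from this study. The verbal labels assume the commonly used Landis and Koch scale, which matches the terms "fair", "moderate" and "substantial" used above but is not named in the abstract. The function and variable names are hypothetical.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two observers rating the same cases (nominal categories)."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both observers must rate the same non-empty set of cases")
    n = len(ratings_a)
    # Observed agreement: fraction of cases given the same category by both.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of each observer's marginal proportions,
    # summed over the categories.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both observers used one identical category throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def landis_koch_label(kappa):
    """Verbal label for kappa, assuming the Landis and Koch cut-offs."""
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Invented ratings of ten fractures by two observers (illustration only).
reader_1 = ["A", "A", "B", "C", "A", "B", "A", "C", "B", "A"]
reader_2 = ["A", "B", "B", "C", "A", "A", "A", "C", "B", "B"]

kappa = cohen_kappa(reader_1, reader_2)
print(f"kappa = {kappa:.2f} ({landis_koch_label(kappa)})")
```

In this toy example the observers agree on 7 of 10 cases, but 0.36 of that agreement is expected by chance, so κ = (0.70 - 0.36) / (1 - 0.36) ≈ 0.53. This shows why raw percentage agreement overstates reproducibility when one category dominates.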