Automatic Structuring of Radiology Free-Text Reports

Abstract
A natural language processor was developed that automatically structures the important medical information (eg, the existence, properties, location, and diagnostic interpretation of findings) contained in a radiology free-text document as a formal information model that can be interpreted by a computer program. The input to the system is a free-text report from a radiologic study. The system requires no reporting style changes on the part of the radiologist. Statistical and machine learning methods are used extensively throughout the system. A graphical user interface has been developed that allows the creation of hand-tagged training examples. Various aspects of the difficult problem of implementing an automated structured reporting system have been addressed, and the relevant technology is progressing well. Extensible Markup Language is emerging as the preferred syntactic standard for representing and distributing these structured reports within a clinical environment. Early successes hold out hope that similar statistically based models of language will allow deep understanding of textual reports. The success of these statistical methods will depend on the availability of large numbers of high-quality training examples for each radiologic subdomain. The acceptability of automated structured reporting systems will ultimately depend on the results of comprehensive evaluations.
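
As a rough illustration of the kind of output described above, the following minimal Python sketch serializes one structured finding (existence, properties, location, and diagnostic interpretation) to Extensible Markup Language. The element and attribute names, the `finding_to_xml` helper, and the sample finding are all hypothetical and chosen only for illustration; they are not the information model or schema used by the system described in this abstract.

```python
import xml.etree.ElementTree as ET


def finding_to_xml(existence, finding, properties, location, interpretation):
    """Serialize one structured finding to an XML string.

    The tag names below are assumptions made for this sketch,
    not the schema of the system described in the abstract.
    """
    root = ET.Element("finding", attrib={"existence": existence})
    ET.SubElement(root, "name").text = finding

    # Each property of the finding becomes one typed child element.
    props = ET.SubElement(root, "properties")
    for prop_type, value in properties.items():
        ET.SubElement(props, "property", attrib={"type": prop_type}).text = value

    ET.SubElement(root, "location").text = location
    ET.SubElement(root, "interpretation").text = interpretation
    return ET.tostring(root, encoding="unicode")


# Hypothetical finding, as might be extracted from a free-text sentence.
xml_text = finding_to_xml(
    existence="present",
    finding="mass",
    properties={"size": "2 cm", "margin": "spiculated"},
    location="right upper lobe",
    interpretation="suspicious for malignancy",
)
print(xml_text)
```

A downstream program could parse this string back with `ET.fromstring` and query individual fields, which is the sense in which a structured report can be interpreted by a computer program.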
