Use of Natural Language Processing to Translate Clinical Information from a Database of 889,921 Chest Radiographic Reports

Abstract
PURPOSE: To evaluate translation of chest radiographic reports by using natural language processing and to compare the findings with those in the literature.

MATERIALS AND METHODS: A natural language processor coded 10 years of narrative chest radiographic reports from an urban academic medical center. Coding for 150 reports was compared with manual coding. Frequencies and co-occurrences of 24 clinical conditions (diseases, abnormalities, and clinical states) were estimated. The ratio of right to left lung mass, association of pleural effusion with other conditions, and frequency of bullet and stab wounds were compared with independent observations. The sensitivity and specificity of the system’s pneumothorax coding were compared with those of manual financial coding.

RESULTS: The system coded 889,921 reports on 251,186 patients. On the basis of manual coding of 150 reports, the processor’s sensitivity (0.81) and specificity (0.99) were comparable to those previously reported for natural language processi...
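
For readers unfamiliar with the evaluation measures named in the abstract, the following is a minimal sketch of how sensitivity and specificity can be computed when automated coding is compared with a manual reference standard. The function name `sensitivity_specificity` and the toy labels are hypothetical and chosen only for illustration; they are not the study's data, and this is not a description of the authors' software.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    reference: Sequence[bool], predicted: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of `predicted` against `reference`.

    reference: manual (reference standard) coding, True if the condition is present.
    predicted: automated coding for the same reports, in the same order.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")

    # Tally the four cells of the 2x2 confusion table.
    tp = sum(1 for r, p in zip(reference, predicted) if r and p)
    fn = sum(1 for r, p in zip(reference, predicted) if r and not p)
    tn = sum(1 for r, p in zip(reference, predicted) if not r and not p)
    fp = sum(1 for r, p in zip(reference, predicted) if not r and p)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative reference case")

    sensitivity = tp / (tp + fn)  # fraction of true cases that were coded
    specificity = tn / (tn + fp)  # fraction of non-cases correctly left uncoded
    return sensitivity, specificity


# Toy example (invented labels, not from the study): 4 positive and
# 6 negative reports according to the hypothetical manual coding.
manual = [True, True, True, True, False, False, False, False, False, False]
system = [True, True, True, False, False, False, False, False, False, True]

sens, spec = sensitivity_specificity(manual, system)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
# prints "sensitivity = 0.75, specificity = 0.83"
```

In this sketch each report contributes one yes/no judgment for a single condition. An evaluation covering several conditions would repeat the tally per condition or pool the judgments; which approach the study used is not stated in the text above.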