Scalable and accurate deep learning with electronic health records
Open Access
- 8 May 2018
- journal article
- research article
- Published by Springer Science and Business Media LLC in npj Digital Medicine
- Vol. 1 (1), 1-10
- https://doi.org/10.1038/s41746-018-0029-1
Abstract
Predictive modeling with electronic health record (EHR) data is anticipated to drive personalized medicine and improve healthcare quality. Constructing predictive statistical models typically requires extraction of curated predictor variables from normalized EHR data, a labor-intensive process that discards the vast majority of information in each patient’s record. We propose a representation of patients’ entire raw EHR records based on the Fast Healthcare Interoperability Resources (FHIR) format. We demonstrate that deep learning methods using this representation are capable of accurately predicting multiple medical events from multiple centers without site-specific data harmonization. We validated our approach using de-identified EHR data from two US academic medical centers with 216,221 adult patients hospitalized for at least 24 h. In the sequential format we propose, this volume of EHR data unrolled into a total of 46,864,534,945 data points, including clinical notes. Deep learning models achieved high accuracy for tasks such as predicting: in-hospital mortality (area under the receiver operator curve [AUROC] across sites 0.93–0.94), 30-day unplanned readmission (AUROC 0.75–0.76), prolonged length of stay (AUROC 0.85–0.86), and all of a patient’s final discharge diagnoses (frequency-weighted AUROC 0.90). These models outperformed traditional, clinically-used predictive models in all cases. We believe that this approach can be used to create accurate and scalable predictions for a variety of clinical scenarios. In a case study of a particular prediction, we demonstrate that neural networks can be used to identify relevant information from the patient’s chart.