Quality of Data in Perinatal Population Health Databases

Abstract
Background: Administrative or population health datasets (PHDS) are increasingly being used for research related to maternal and infant health. However, the validity of the results of this research depends on the accuracy and completeness of the information in the PHDS.
Objective: To compile and review studies that validate the reporting of conditions and procedures related to pregnancy, childbirth, and newborns, and to provide a reference tool for researchers.
Methods: A systematic search of the Medline and EMBASE databases was conducted to find studies that validated routinely collected datasets containing diagnoses and procedures related to pregnancy, childbirth, and newborns. To be included, datasets had to be validated against a gold standard, such as review of medical records, maternal interview or survey, specialized register, or laboratory data. Sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and/or kappa statistic were calculated for each diagnosis or procedure code.
Results: Forty-three validation studies were included. Under-enumeration was common, with the level of ascertainment increasing as the time from diagnosis or procedure to birth decreased. Most conditions and procedures had high specificities, indicating few false positives, and procedures were more accurately reported than diagnoses. Hospital discharge data were generally more accurate than birth data; however, identifying cases from more than 1 dataset further increased ascertainment.
Conclusions: This comprehensive collection of validation studies summarizing the quality of perinatal population data will be an invaluable resource to all researchers working with PHDS.
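
The accuracy measures named in the Methods can be illustrated with a minimal Python sketch. It assumes that each diagnosis or procedure code is compared with the gold standard in a 2x2 table of true positives, false positives, false negatives, and true negatives. The names `ValidationCounts` and `validation_measures` and the example counts are hypothetical and are not taken from any of the reviewed studies.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ValidationCounts:
    """2x2 table of dataset coding versus gold standard (hypothetical helper)."""
    tp: int  # coded in dataset, confirmed by gold standard
    fp: int  # coded in dataset, not confirmed by gold standard
    fn: int  # not coded in dataset, present in gold standard
    tn: int  # not coded in dataset, absent in gold standard


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def validation_measures(c: ValidationCounts) -> dict:
    """Compute sensitivity, specificity, PPV, NPV, and Cohen's kappa."""
    n = c.tp + c.fp + c.fn + c.tn
    # Observed agreement between the dataset and the gold standard.
    observed = _ratio(c.tp + c.tn, n)
    # Agreement expected by chance, from the marginal totals.
    expected = _ratio(
        (c.tp + c.fp) * (c.tp + c.fn) + (c.fn + c.tn) * (c.fp + c.tn),
        n * n,
    )
    return {
        "sensitivity": _ratio(c.tp, c.tp + c.fn),
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "ppv": _ratio(c.tp, c.tp + c.fp),
        "npv": _ratio(c.tn, c.tn + c.fn),
        "kappa": _ratio(observed - expected, 1 - expected),
    }


# Hypothetical counts for one code in 1000 birth records (illustration only).
example = validation_measures(ValidationCounts(tp=80, fp=10, fn=20, tn=890))
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

In this made-up example, specificity is high while sensitivity is lower, which is the pattern that under-enumeration with few false positives would produce.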
