Guided Multiple Imputation of Missing Data
- 1 March 2007
- journal article
- Published by Wolters Kluwer Health in Epidemiology
- Vol. 18 (2), 246-252
- https://doi.org/10.1097/01.ede.0000254708.40228.8b
Abstract
Multiple imputation can be a good solution for handling missing data if data are missing at random. However, this assumption is often difficult to verify. We describe an application of multiple imputation that makes this assumption plausible. This procedure requires contacting a random sample of subjects with incomplete data to fill in the missing information, and then adjusting the imputation model to incorporate the new data. Simulations with missing data that were decidedly not missing at random showed, as expected, that the method restored the original beta coefficients, whereas other methods of dealing with missing data failed. Using a dataset with real missing data, we found that different approaches to imputation produced moderately different results. Simulations suggest that filling in 10% of the data that were initially missing is sufficient for imputation in many epidemiologic applications, and should produce approximately unbiased results, provided there is a high response on follow-up from the subsample of those with some originally missing data. This response can probably be achieved if this data collection is planned as an initial approach to dealing with the missing data, rather than at later stages, after further attempts that leave only data that are very difficult to complete.
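The idea in the abstract can be illustrated with a small simulation. The sketch below is not the authors' procedure or code; it is a simplified toy example under stated assumptions. It assumes one variable `x` whose missingness depends on its own value (so the data are not missing at random), one fully observed covariate `z`, a linear imputation model, and complete response from a 10% follow-up subsample. It takes one simple reading of "adjusting the imputation model": the model is fitted only on the followed-up subjects, who are a random sample of those with missing values. All function and variable names are hypothetical, and only the pooled point estimate is computed (no parameter draws, no variance estimate).

```python
import numpy as np


def fit_linear(z, x):
    """OLS of x on z; returns (intercept, slope) and the residual SD."""
    design = np.column_stack([np.ones_like(z), z])
    coef, *_ = np.linalg.lstsq(design, x, rcond=None)
    resid = x - design @ coef
    return coef, resid.std(ddof=2)


def pooled_mean(z, x_known, donors, targets, m, rng):
    """Fit the imputation model on `donors`, impute `targets` m times,
    and return the average of the m completed-data means.

    Simplification (assumption): model parameters are held fixed across
    imputations instead of being drawn from their posterior.
    """
    coef, sd = fit_linear(z[donors], x_known[donors])
    means = []
    for _ in range(m):
        filled = x_known.copy()
        predicted = coef[0] + coef[1] * z[targets]
        filled[targets] = predicted + rng.normal(scale=sd, size=targets.sum())
        means.append(filled.mean())
    return float(np.mean(means))


def run_demo(n=20000, followup_fraction=0.10, m=20, seed=0):
    rng = np.random.default_rng(seed)

    # Simulated data (assumed structure, for illustration only).
    z = rng.normal(size=n)                    # fully observed covariate
    x = 1.0 + 0.5 * z + rng.normal(size=n)    # variable with missing values

    # Not missing at random: larger x is more likely to be missing.
    missing = rng.random(n) < 1.0 / (1.0 + np.exp(-(x - 1.0)))
    x_obs = np.where(missing, np.nan, x)

    # Standard imputation: model fitted on the originally observed subjects.
    naive = pooled_mean(z, x_obs, ~missing, missing, m, rng)

    # Guided imputation: follow up a random subsample of the subjects with
    # missing x, record their true values (assumes everyone responds), and
    # fit the imputation model on that subsample only.
    missing_idx = np.flatnonzero(missing)
    k = int(round(followup_fraction * missing_idx.size))
    followed = np.zeros(n, dtype=bool)
    followed[rng.choice(missing_idx, size=k, replace=False)] = True
    x_guided = x_obs.copy()
    x_guided[followed] = x[followed]
    guided = pooled_mean(z, x_guided, followed, missing & ~followed, m, rng)

    return {
        "truth": float(x.mean()),
        "complete_case": float(np.nanmean(x_obs)),
        "naive_mi": naive,
        "guided_mi": guided,
    }


if __name__ == "__main__":
    for name, value in run_demo().items():
        print(f"{name}: {value:.3f}")
```

In this toy setting the complete-case mean and the standard imputation should both underestimate the true mean, because the observed subjects systematically have lower values of `x`, while the guided version should land close to it. Lowering the follow-up response (for example, by letting the chance of responding depend on `x`) would be one way to explore the abstract's caveat about needing a high response from the subsample.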