Editing Data: What Difference Do Consistency Checks Make?

Abstract
In 1998, the Florida Department of Health undertook a self-administered school-based survey of tobacco use, attitudes, and behaviors among nearly 23,000 public school students in grades 6–12. The survey design did not use skip patterns; therefore, students had multiple opportunities to contradict themselves. By using examples from the high school portion (grades 9–12) of the survey, the authors examined five possible approaches to handling data inconsistencies and the effect that each has on point estimates. Use of these approaches resulted in point estimates of current cigarette use ranging from 25.6% to 29.7%. The number of missing respondents varied from 33 (less than 1%) to 1,374 (13%), depending on which approach was used. After stratification by gender and race, the prevalence estimates changed marginally for girls but strikingly for boys. Non-Hispanic White students were substantially more likely than non-Hispanic Black students to report current cigarette use, but the magnitude of this difference varied significantly according to the analytical approach used. The approach used to check data consistency may influence point estimates and comparability with other studies. Therefore, this issue should be addressed when findings are reported. Am J Epidemiol 2000;151:921-6.
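
The mechanism described above, in which the choice of editing rule changes both the prevalence estimate and the number of respondents coded as missing, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the six toy records, the field names `ever_smoked` and `days_smoked_past_30`, and the two rules shown are hypothetical and are not the five approaches the authors compared, which the abstract does not specify.

```python
from typing import Callable, Optional

# Hypothetical toy records (not survey data). Without skip patterns, a student
# can answer "no" to ever smoking and still report smoking in the past 30 days.
records = [
    {"ever_smoked": "yes", "days_smoked_past_30": 10},
    {"ever_smoked": "yes", "days_smoked_past_30": 0},
    {"ever_smoked": "no", "days_smoked_past_30": 0},
    {"ever_smoked": "no", "days_smoked_past_30": 5},      # contradictory answers
    {"ever_smoked": "no", "days_smoked_past_30": 0},
    {"ever_smoked": "yes", "days_smoked_past_30": None},  # question left blank
]


def face_value(record: dict) -> Optional[bool]:
    """Illustrative rule 1: trust the 30-day question alone, ignore contradictions."""
    days = record["days_smoked_past_30"]
    if days is None:
        return None  # unanswered, so coded as missing
    return days > 0


def strict(record: dict) -> Optional[bool]:
    """Illustrative rule 2: code a respondent as missing if the answers conflict."""
    days = record["days_smoked_past_30"]
    if days is None:
        return None
    if record["ever_smoked"] == "no" and days > 0:
        return None  # inconsistent, so excluded from the estimate
    return days > 0


def prevalence(rows: list, rule: Callable[[dict], Optional[bool]]) -> tuple:
    """Return (share of valid respondents coded as current users, number missing)."""
    coded = [rule(row) for row in rows]
    valid = [value for value in coded if value is not None]
    return sum(valid) / len(valid), len(coded) - len(valid)


for name, rule in [("face value", face_value), ("strict", strict)]:
    estimate, missing = prevalence(records, rule)
    print(f"{name}: prevalence={estimate:.1%}, missing={missing}")
```

On these toy records the face-value rule keeps five respondents and the strict rule keeps four, so the same responses yield different estimates and different missing counts. This is the kind of sensitivity the abstract reports, shown at a much smaller scale.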