Positional Accuracy of Geocoded Addresses in Epidemiologic Research

Abstract
Background: Geographic information systems (GIS) offer powerful techniques for epidemiologists. Geocoding is an important step in the use of GIS in epidemiologic research, and the validity of epidemiologic studies using this methodology depends, in part, on the positional accuracy of the geocoding process.

Methods: We compared positions geocoded with a commercially available program against positions determined by Global Positioning System (GPS) satellite receivers. Addresses (N = 200) were randomly selected from a recently completed case–control study in Western New York State. We geocoded the addresses using ArcView 3.2 with the GDT Dynamap/2000 U.S. Street database, and we also measured the longitude and latitude of each address with a GPS receiver. The distance between the locations obtained by the two methods was calculated for every address.

Results: The distance between the geocoded point and the GPS point was within 100 m for the majority of subject addresses (79%), and only a small proportion (3%) had a distance greater than 800 m. The overall median distance between GPS points and geocoded points was 38 m (90% confidence interval [CI] = 34–46). Distances did not differ between cases and controls. Urban addresses (median = 32 m; CI = 28–37) were geocoded slightly more accurately than nonurban addresses (median = 52 m; CI = 44–61).

Conclusions: This study indicates that the suitability of geocoding for epidemiologic research depends on the level of spatial resolution required to assess exposure. Although sources of positional error exist for geocoded addresses, geocoding was, for the most part, accurate in this sample: most geocoded points fell within 100 m of the GPS position.
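The distance comparison described in the Methods can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: the abstract does not say which distance formula was used, so the haversine great-circle distance, the mean Earth radius constant, and the function names `haversine_m` and `summarize` are all assumptions made here for illustration.

```python
import math
from statistics import median

# Mean Earth radius in meters; an assumption of this sketch, not a value from the study.
EARTH_RADIUS_M = 6_371_000


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def summarize(pairs, threshold_m=100.0):
    """Return (median distance, proportion within threshold_m).

    `pairs` is a list of ((geocoded_lat, geocoded_lon), (gps_lat, gps_lon)) tuples.
    """
    distances = [haversine_m(*geocoded, *gps) for geocoded, gps in pairs]
    within = sum(d <= threshold_m for d in distances) / len(distances)
    return median(distances), within


if __name__ == "__main__":
    # Hypothetical coordinates for illustration only; not data from the study.
    example_pairs = [
        ((43.0, -78.8), (43.0, -78.8)),
        ((43.0, -78.8), (43.0005, -78.8)),
        ((43.0, -78.8), (43.01, -78.8)),
    ]
    med, share = summarize(example_pairs)
    print(f"median distance: {med:.1f} m, within 100 m: {share:.0%}")
```

Applied to the study's 200 address pairs, a summary of this kind would yield the figures reported in the Results (median distance and the share of addresses within 100 m). Over distances of tens to hundreds of meters, a planar distance computed in a projected coordinate system would give nearly identical values.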