Identifying Multiple Submissions in Internet Research: Preserving Data Integrity

Abstract
Internet-based sexuality research with hidden populations has become increasingly popular. Respondent anonymity may encourage participation and lower social desirability, but associated disinhibition may promote multiple submissions, especially when incentives are offered. The goal of this study was to identify the usefulness of different variables for detecting multiple submissions from repeat responders and to explore incentive effects. The data included 1,900 submissions from a three-session Internet intervention with a pretest and three post-test questionnaires. Participants were men who have sex with men and incentives were offered to rural participants for completing each questionnaire. The final number of submissions included 1,273 “unique”, 132 first submissions by “repeat responders” and 495 additional submissions by the “repeat responders” (N = 1,900). Four categories of repeat responders were identified: “infrequent” (2–5 submissions), “persistent” (6–10 submissions), “very persistent” (11–30 submissions), and “hackers” (more than 30 submissions). Internet Provider (IP) addresses, user names, and passwords were the most useful for identifying “infrequent” repeat responders. “Hackers” often varied their IP address and identifying information to prevent easy identification, but investigating the data for small variations in IP, using reverse telephone look up, and patterns across usernames and passwords were helpful. Incentives appeared to play a role in stimulating multiple submissions, especially from the more sophisticated “hackers”. Finally, the web is ever evolving and it will be necessary to have good programmers and staff who evolve as fast as “hackers”.