What is the prevalence of health-related searches on the World Wide Web? Qualitative and quantitative analysis of search engine queries on the internet.

1 January 2003

journal article

Vol. 2003, 225-9

Abstract

While health information is often said to be the most sought-after information on the web, empirical data on the actual frequency of health-related searches are missing. In the present study we aimed to determine the prevalence of health-related searches on the web by analyzing search terms entered by people into popular search engines. We also made preliminary attempts to describe and classify these searches qualitatively. Occasional difficulties in determining what constitutes a "health-related" search led us to propose and validate a simple method to automatically classify a search string as "health-related". This method is based on the number of web pages containing both the search string and the word "health", expressed as a proportion of the total number of pages containing the search string alone. Using human codings as the gold standard, we plotted a ROC curve and determined empirically that if this "co-occurrence rate" is larger than 35%, the search string can be said to be health-related (sensitivity: 85.2%, specificity: 80.4%). Our "human" codings of search queries showed that about 4.5% of all searches are "health-related". We estimate that globally a minimum of 6.75 million health-related searches are conducted on the web every day, which is roughly the same number of searches as were conducted on the NLM MEDLARS system in the whole of 1996.
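
The co-occurrence rule described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names and the page counts in the usage note are hypothetical, and the abstract does not say how page counts were obtained from the search engine. Only the 35% cutoff comes from the abstract.

```python
from typing import Iterable, Tuple

# Cutoff reported in the abstract; everything else here is illustrative.
THRESHOLD = 0.35


def co_occurrence_rate(pages_with_health: int, pages_total: int) -> float:
    """Share of pages containing the search string that also contain "health"."""
    if pages_total <= 0:
        return 0.0  # assumption: a string with no hits is treated as rate 0
    return pages_with_health / pages_total


def is_health_related(pages_with_health: int, pages_total: int,
                      threshold: float = THRESHOLD) -> bool:
    """Classify a search string as health-related if its rate exceeds the cutoff."""
    return co_occurrence_rate(pages_with_health, pages_total) > threshold


def sensitivity_specificity(pairs: Iterable[Tuple[bool, bool]]) -> Tuple[float, float]:
    """Compare (predicted, human_coded) labels against the human gold standard."""
    tp = fn = tn = fp = 0
    for predicted, gold in pairs:
        if gold and predicted:
            tp += 1
        elif gold:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity
```

With made-up counts, `is_health_related(620, 1000)` yields `True` (rate 0.62) and `is_health_related(40, 1000)` yields `False` (rate 0.04). Repeating the sensitivity and specificity calculation over a range of thresholds gives the points of a ROC curve, which is how a cutoff such as 35% can be chosen empirically.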

Cited by 64 articles