Inflation of the type I error rate when a continuous confounding variable is categorized in logistic regression analyses

Abstract
This paper demonstrates the inflation of the type I error rate that occurs when testing the statistical significance of a continuous risk factor after adjusting for a correlated continuous confounding variable that has been converted into a categorical variable. We used Monte Carlo simulation methods to assess this inflation when testing the statistical significance of a risk factor after adjusting for a continuous confounding variable that has been divided into categories. We found that the inflation of the type I error rate increases with increasing sample size, with increasing correlation between the risk factor and the confounding variable, and with a decreasing number of categories into which the confounder is divided. Even when the confounder was divided into a five‐level categorical variable, the inflation of the type I error rate remained high when both the sample size and the correlation between the risk factor and the confounder were high. Copyright 2004 John Wiley & Sons, Ltd.
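
The following is a minimal simulation sketch of the phenomenon the abstract describes, not the authors' actual simulation design. It assumes a bivariate normal risk factor and confounder, an outcome generated from a logistic model in which only the continuous confounder has an effect (so the null hypothesis for the risk factor is true), and quantile-based categorization of the confounder; the sample size, correlation, number of categories, and regression coefficients are illustrative choices.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(2004)

def one_replicate(n=1000, rho=0.7, n_cats=3, alpha=0.05):
    """Simulate one data set under the null (the risk factor X has no effect
    beyond the confounder Z) and test X after adjusting for the *categorized* Z."""
    # Correlated continuous risk factor X and confounder Z
    cov = [[1.0, rho], [rho, 1.0]]
    x, z = rng.multivariate_normal([0.0, 0.0], cov, size=n).T

    # Outcome depends only on the continuous confounder Z (illustrative coefficients)
    p = 1.0 / (1.0 + np.exp(-(-1.0 + 1.0 * z)))
    y = rng.binomial(1, p)

    # Categorize Z into equal-frequency groups and enter it as dummy variables
    z_cat = pd.qcut(z, q=n_cats, labels=False)
    dummies = pd.get_dummies(pd.Series(z_cat), prefix="z", drop_first=True, dtype=float)
    design = sm.add_constant(pd.concat([pd.Series(x, name="x"), dummies], axis=1))

    # Logistic regression of Y on X and the categorized confounder
    fit = sm.Logit(y, design).fit(disp=0)
    return fit.pvalues["x"] < alpha  # reject the true null hypothesis for X?

n_sims = 2000
rejections = sum(one_replicate() for _ in range(n_sims))
print(f"Empirical type I error rate: {rejections / n_sims:.3f} (nominal 0.05)")
```

Under this setup, the empirical rejection rate for the risk factor should exceed the nominal 0.05 level because categorizing the confounder leaves residual confounding within categories; varying `n`, `rho`, and `n_cats` would illustrate the trends reported in the abstract.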