Abstract
Type I error rates of the likelihood ratio test for the detection of differential item functioning (DIF) were investigated using Monte Carlo simulations. The graded response model with five ordered categories was used to generate datasets of a 30-item test for samples of 300 and 1,000 simulated examinees. All DIF comparisons were simulated by randomly pairing two groups of examinees. Three different sample size combinations of reference and focal groups were simulated under two ability matching conditions. For each of the six combinations of sample sizes by ability matching conditions, 100 replications of DIF detection comparisons were simulated. Item parameter estimates and likelihood values were obtained by marginal maximum likelihood estimation using the computer program MULTILOG. Type I error rates of the likelihood ratio test statistics for all six combinations of sample sizes and ability matching conditions were within theoretically expected values at each of the nominal alpha levels considered.
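The data-generation step described above can be illustrated with a minimal sketch. It is not the procedure used in the study: the item parameter values, the standard normal ability distribution for both groups, the random seed, and the function names `grm_category_probs` and `simulate_responses` are all assumptions made for illustration. It draws five-category responses to 30 items under the graded response model, with the reference and focal groups sharing the same item parameters, so that any DIF flagged in such data would be a Type I error.

```python
import math
import random


def grm_category_probs(theta, a, thresholds):
    """Category probabilities under the graded response model.

    theta: examinee ability; a: item discrimination;
    thresholds: ascending between-category thresholds (4 for 5 categories).
    """
    # Cumulative probability of responding in category k or higher,
    # padded with 1.0 (lowest category or higher) and 0.0 (above the top).
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cum += [0.0]
    # Each category probability is a difference of adjacent cumulative values.
    return [cum[k] - cum[k + 1] for k in range(len(thresholds) + 1)]


def simulate_responses(n_examinees, items, rng):
    """Simulate a response matrix (examinees x items), categories coded 0..4."""
    data = []
    for _ in range(n_examinees):
        theta = rng.gauss(0.0, 1.0)  # assumed N(0, 1) ability distribution
        row = []
        for a, thresholds in items:
            probs = grm_category_probs(theta, a, thresholds)
            row.append(rng.choices(range(len(probs)), weights=probs)[0])
        data.append(row)
    return data


rng = random.Random(0)  # arbitrary seed, for reproducibility only

# Hypothetical item parameters: 30 items, 4 ordered thresholds each.
items = [
    (rng.uniform(1.0, 2.0), sorted(rng.gauss(0.0, 1.0) for _ in range(4)))
    for _ in range(30)
]

# Both groups use the same item parameters, so no DIF is present.
reference = simulate_responses(300, items, rng)
focal = simulate_responses(300, items, rng)
```

In a full replication of the design, each such pair of datasets would then be fitted with and without equality constraints on the studied item's parameters, and the difference in the two -2 log-likelihood values would be referred to a chi-square distribution.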