An Essay on Measurement and Factorial Invariance

Abstract
Analyses of subgroups, such as ethnic, language, or education groups selected from a parent population, are common in health disparities research. One goal of such analyses is to examine measurement equivalence, which involves both qualitative review of the meaning of items and quantitative examination of different levels of factorial invariance and differential item functioning. The purpose of this essay is to review the definitions and assumptions associated with factorial invariance, placing this formulation in the context of bias, fairness, and equity. The connection between factorial invariance and item bias (differential item functioning) is discussed using a variant of item response theory, as are the situations in which the different forms of invariance (weak, strong, and strict) are required.
Establishing factorial invariance involves a hierarchy of tests of weak, strong, and strict invariance. Pattern (metric or weak) factorial invariance implies that the regression slopes are invariant across groups; it requires only invariant factor loadings. Strong factorial invariance implies that the conditional expectation of the response, given the common and specific factors, is invariant across groups. It requires that specific factor means (represented as invariant intercepts) also be identical across groups. Strict factorial invariance implies that, in addition, the conditional variance of the response, given the common and specific factors, is invariant across groups. It requires that, in addition to equal factor loadings and intercepts, the residual (specific factor plus error variable) variances be equivalent across groups.
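The hierarchy described above can be written compactly in a standard linear factor model. The notation below is an assumption introduced for illustration and does not appear in the text: for group g, X_g is the vector of item responses, tau_g the intercepts, Lambda_g the factor loadings, xi_g the common factor, and delta_g the residual (specific factor plus error) with diagonal covariance Theta_g. As a simplification, specific-factor means are absorbed into the intercepts and the conditioning is written on the common factor only.

```latex
% Linear factor model in group g (notation assumed for illustration)
\begin{align*}
X_g &= \tau_g + \Lambda_g \xi_g + \delta_g,
  \qquad \mathrm{E}[\delta_g] = 0,\quad
  \mathrm{Cov}(\xi_g, \delta_g) = 0,\quad
  \mathrm{Var}(\delta_g) = \Theta_g \\[4pt]
\text{Weak (pattern, metric):}\quad
  & \Lambda_g = \Lambda \ \text{for all } g \\
\text{Strong:}\quad
  & \Lambda_g = \Lambda,\ \tau_g = \tau
  \ \Longrightarrow\
  \mathrm{E}[X_g \mid \xi_g] = \tau + \Lambda \xi_g \\
\text{Strict:}\quad
  & \Lambda_g = \Lambda,\ \tau_g = \tau,\ \Theta_g = \Theta
  \ \Longrightarrow\
  \mathrm{Var}[X_g \mid \xi_g] = \Theta
\end{align*}
```

Each level adds one equality constraint to the level before it, so strict invariance implies strong invariance, which in turn implies weak invariance. The common factor mean and variance are left free to differ across groups at every level.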
The concept of measurement invariance most closely aligned with item response theory treats the latent variable as a common factor measured by manifest variables; the specific factors can be characterized as nuisance variables. Invariance of factor loadings across studied groups is required for valid comparisons of scale score or latent variable means. Strong and strict invariance may be less important in basic research, in which group differences in specific factors are indicative of individual differences that are important for scientific exploration. However, for most applications in which the aim is to ensure fairness and equity, strict factorial invariance is required. Health disparities research often focuses on self-reported clinical outcomes, such as quality of life, that are not observed directly. Latent variable models such as factor analysis are central to establishing valid assessment of such outcomes.
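To make the fairness concern concrete, the following is a minimal sketch, not a method taken from the essay, of how a violation of intercept invariance can produce a difference in scale score means even when two groups have the same common factor mean. It assumes a one-factor model with uncorrelated residuals and a unit-weighted sum score; the class GroupModel, the helper functions, and all parameter values are hypothetical and chosen only for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class GroupModel:
    """Hypothetical one-factor model parameters for a single group."""
    loadings: List[float]        # factor loadings, one per item
    intercepts: List[float]      # item intercepts
    residual_vars: List[float]   # residual (specific + error) variances
    factor_mean: float           # common factor mean
    factor_var: float            # common factor variance


def sum_score_moments(g: GroupModel) -> Tuple[float, float]:
    """Model-implied mean and variance of the unit-weighted sum of items,
    assuming residuals are uncorrelated with each other and with the factor."""
    total_loading = sum(g.loadings)
    mean = sum(g.intercepts) + total_loading * g.factor_mean
    var = total_loading ** 2 * g.factor_var + sum(g.residual_vars)
    return mean, var


def _same(a: List[float], b: List[float], tol: float = 1e-9) -> bool:
    """True when two parameter vectors agree element by element."""
    return len(a) == len(b) and all(
        math.isclose(x, y, abs_tol=tol) for x, y in zip(a, b)
    )


def invariance_level(a: GroupModel, b: GroupModel) -> str:
    """Highest level of the hierarchy satisfied by the two parameter sets:
    'none', 'weak', 'strong', or 'strict'."""
    if not _same(a.loadings, b.loadings):
        return "none"
    if not _same(a.intercepts, b.intercepts):
        return "weak"
    if not _same(a.residual_vars, b.residual_vars):
        return "strong"
    return "strict"


# Illustrative values only: the groups share loadings, residual variances,
# and the common factor distribution, but differ in one item intercept.
reference = GroupModel([0.8, 0.7, 0.6], [2.0, 2.5, 3.0],
                       [0.36, 0.51, 0.64], factor_mean=0.0, factor_var=1.0)
focal = GroupModel([0.8, 0.7, 0.6], [2.0, 2.2, 3.0],
                   [0.36, 0.51, 0.64], factor_mean=0.0, factor_var=1.0)

level = invariance_level(reference, focal)   # "weak"
ref_mean, ref_var = sum_score_moments(reference)
focal_mean, focal_var = sum_score_moments(focal)

print("invariance level:", level)
print("sum score means:", ref_mean, focal_mean)
```

In this sketch the two groups are identical on the common factor, yet the sum score means differ by the size of the intercept shift. An observed difference in scale scores would therefore reflect the measurement model rather than the attribute being measured.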