Examining the Measurement Quality of Tests Containing Differentially Functioning Items: Do Biased Items Result in Poor Measurement?

1 April 1999

journal article
research article
Published by SAGE Publications in Educational and Psychological Measurement

Vol. 59 (2), 248-269
https://doi.org/10.1177/00131649921969839

Abstract

This study investigated the effects of retaining test items manifesting differential item functioning (DIF) on aspects of the measurement quality and validity of that test’s scores. DIF was evaluated using the Mantel-Haenszel procedure, which allows one to detect items that function differently in two groups of examinees at constant levels of the trait. Multiple composites of DIF- and non-DIF-containing items were created to examine the impact of DIF on the measurement, validity, and predictive relations involving those composites. Criteria used were the American College Testing composite; the Scholastic Aptitude Test (SAT) verbal (SATV), quantitative (SATQ), and composite (SATC) scores; and grade point average rank percentile. Results indicate that the measurement quality of tests is not seriously degraded when items manifesting DIF are retained, even when the number of items in the compared composites has been controlled. Implications of the results are discussed within the framework of multiple determinants of item responses.
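
As an illustration of the Mantel-Haenszel idea mentioned in the abstract, the following minimal sketch computes the common odds ratio for a single item across strata of a matching variable. It is not the authors' implementation. The function names, the "ref"/"focal" group labels, and the use of a total test score as the matching variable are all assumptions made for this example. A value near 1 suggests the item functions similarly in both groups at matched trait levels; the second helper applies the widely used -2.35 ln(alpha) rescaling to the delta metric.

```python
import math
from collections import defaultdict


def mantel_haenszel_odds_ratio(item_correct, group, matching_score):
    """Mantel-Haenszel common odds ratio for one item (illustrative sketch).

    item_correct   : iterable of 0/1 responses to the studied item
    group          : iterable of "ref" or "focal" labels (hypothetical labels)
    matching_score : iterable of matching values, e.g. total test score (assumed)
    """
    # One 2x2 table per stratum, stored as [A, B, C, D]:
    # A = ref correct, B = ref incorrect, C = focal correct, D = focal incorrect
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for y, g, s in zip(item_correct, group, matching_score):
        idx = (0 if y else 1) + (0 if g == "ref" else 2)
        tables[s][idx] += 1

    num = 0.0  # sum over strata of A*D / N
    den = 0.0  # sum over strata of B*C / N
    for a, b, c, d in tables.values():
        n = a + b + c + d
        num += a * d / n
        den += b * c / n

    if den == 0:
        raise ValueError("odds ratio undefined: no stratum has B*C > 0")
    return num / den


def mh_d_dif(alpha):
    """Rescale the common odds ratio to the delta metric (-2.35 * ln(alpha))."""
    return -2.35 * math.log(alpha)
```

Under this sign convention, an odds ratio above 1 (a negative delta value) means the item favors the reference group among examinees with the same matching score.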

