Do Item-discrimination Indices Really Help Us to Improve Our Tests?

1 July 2001

journal article
research article
Published by Taylor & Francis in Assessment & Evaluation in Higher Education

Vol. 26 (3), 213-220
https://doi.org/10.1080/02602930120052378

Abstract

For tests in which items individually score 0 or 1, the commonly described 'U-L Index' compares, for any one item, the numbers of correct responses given by two groups of examinees, those with the highest and lowest scores. U-L Indices can be very unreliable if not based on large numbers of test results and, being much dependent on the spread of total scores, cannot generally be compared from test to test. Coefficients of correlation between item scores and total scores are more reliable and are nowadays simpler to obtain, but they have similar defects. They are therefore no substitute for careful scrutiny of item wording. Properties of the indices, including the effects of guessing and of question complexity, are explored and illustrated through simple computer-processed models that are advocated as investigative and learning tools.
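
As a minimal sketch of the kind of simple computer-processed model the abstract advocates, the Python below simulates 0/1 item scores and computes both indices for one item. Everything not stated in the abstract is an assumption made for illustration: the function names (`simulate`, `ul_index`, `item_total_correlation`), the conventional 27% fraction for the upper and lower groups, the use of the uncorrected total score (the item itself is included in the total), and the crude response model in which a single ability value plus a guessing floor sets the probability of a correct answer. It is not the author's model.

```python
import random
from statistics import mean, pstdev


def simulate(n_examinees, n_items, guess=0.0, seed=0):
    """Hypothetical response model: each examinee has one ability value in
    [0, 1); every item is answered correctly with probability
    guess + (1 - guess) * ability. Returns rows of 0/1 item scores."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_examinees):
        ability = rng.random()
        p_correct = guess + (1.0 - guess) * ability
        data.append([1 if rng.random() < p_correct else 0
                     for _ in range(n_items)])
    return data


def ul_index(responses, item, fraction=0.27):
    """U-L index for one item: (U - L) / n, where U and L are the numbers of
    correct responses in the highest- and lowest-scoring groups of size n.
    The 27% default is an assumed convention, not taken from the abstract."""
    totals = [sum(row) for row in responses]
    order = sorted(range(len(responses)), key=lambda i: totals[i])
    n = max(1, round(fraction * len(responses)))
    lower, upper = order[:n], order[-n:]
    u = sum(responses[i][item] for i in upper)
    l = sum(responses[i][item] for i in lower)
    return (u - l) / n


def item_total_correlation(responses, item):
    """Pearson correlation between one item's 0/1 scores and total scores.
    Returns 0.0 when either variable is constant (correlation undefined)."""
    x = [row[item] for row in responses]
    y = [sum(row) for row in responses]
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        return 0.0
    mx, my = mean(x), mean(y)
    cov = mean((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / (sx * sy)


if __name__ == "__main__":
    # Repeat a small simulated test with different seeds to see how much
    # each index for item 0 varies from sample to sample.
    for seed in range(5):
        data = simulate(n_examinees=30, n_items=10, guess=0.25, seed=seed)
        print(seed,
              round(ul_index(data, 0), 2),
              round(item_total_correlation(data, 0), 2))
```

Rerunning the loop with a larger `n_examinees`, or with a different `guess` value, is one way to explore how the spread of the two indices across samples changes; no particular output is claimed here.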
