Abstract
For tests in which each item is scored 0 or 1, the commonly described 'U-L Index' compares, for any one item, the numbers of correct responses given by two groups of examinees: those with the highest and those with the lowest total scores. U-L Indices can be very unreliable unless based on large numbers of test results and, because they depend strongly on the spread of total scores, cannot generally be compared from test to test. Coefficients of correlation between item scores and total scores are more reliable and are nowadays simpler to obtain, but they have similar defects. They are therefore no substitute for careful scrutiny of item wording. Properties of the indices, including the effects of guessing and of question complexity, are explored and illustrated through simple computer-processed models, which are advocated as investigative and learning tools.
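To make the two indices concrete, the following is a minimal sketch rather than the author's own procedure. It assumes a common textbook form of the U-L Index, (U - L) / n, where U and L are the numbers of correct responses in upper and lower groups of n examinees each. The default group fraction of 27%, the function names `ul_index` and `item_total_correlation`, and the toy response matrix are illustrative assumptions, not taken from the paper.

```python
from statistics import mean, pstdev


def ul_index(responses, item, fraction=0.27):
    """U-L Index for one item, assumed here to be (U - L) / n.

    responses: list of rows, one per examinee, each a list of 0/1 item scores.
    fraction:  share of examinees in each extreme group (0.27 is an assumed,
               conventional choice, not a value given in the abstract).
    """
    totals = [sum(row) for row in responses]
    # Rank examinees by total score; ties are broken by input order.
    order = sorted(range(len(responses)), key=lambda i: totals[i])
    n = max(1, round(fraction * len(responses)))
    lower, upper = order[:n], order[-n:]
    u = sum(responses[i][item] for i in upper)
    l = sum(responses[i][item] for i in lower)
    return (u - l) / n


def item_total_correlation(responses, item):
    """Pearson correlation between one item's 0/1 scores and total scores.

    The total includes the item itself (an uncorrected item-total correlation).
    Returns NaN when either variable has zero variance.
    """
    x = [row[item] for row in responses]
    t = [sum(row) for row in responses]
    sx, st = pstdev(x), pstdev(t)
    if sx == 0 or st == 0:
        return float("nan")
    mx, mt = mean(x), mean(t)
    cov = mean([(a - mx) * (b - mt) for a, b in zip(x, t)])
    return cov / (sx * st)


# Hypothetical toy data: 6 examinees, 3 items.
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]

if __name__ == "__main__":
    for item in range(3):
        print(item,
              round(ul_index(responses, item, fraction=0.5), 3),
              round(item_total_correlation(responses, item), 3))
```

With so few examinees the values shift noticeably if a single response changes, which is the unreliability the abstract warns about. Both functions also depend on the spread of total scores, so their outputs are not directly comparable across tests.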
