Some reasons for disagreement among scorers of infant intelligence test items

Abstract
Six videotapes covering 1406 administrations of items from the Griffiths Scales of Mental Development were made, and each one was shown to a different panel of 9 or 10 observers. Each observer independently scored the items and commented on administration and scoring. Reasons for disagreements among scorers were considered in terms of these comments and the characteristics of the individuals and groups participating. Disagreement was particularly high for items from the Baby Scales. This was at least partly due to the use of mothers' reports. Mothers' reports present a particular problem when elicited on occasions for which they are not recommended by the manual. Across all ages, disagreement was related to difficulties with both administrative and scoring criteria. The present study was designed to be sensitive to a range of sources of unreliability, and the findings may have relevance for other infant assessment procedures.
