Abstract
Portfolio assessment, that is, the evaluation of performance by means of a cumulative collection of student work, has figured prominently in recent US debate about education reform. Proponents hope not only to broaden measurement of performance, but also to use portfolio assessment to encourage improved instruction. Although portfolio assessment has sparked considerable attention and enthusiasm, it has been incorporated into only a few of the nearly ubiquitous large-scale external assessment programmes in the US. This paper evaluates the quality of the performance data produced by several large-scale portfolio efforts. Evaluations of reliability, which have focused primarily on the consistency of scoring, have yielded highly variable results. While high levels of consistency have been reached in some cases, scoring has been quite inconsistent in others, to the point of severely limiting the utility of scores. Information about other aspects of validity is more limited and generally discouraging. For example, scores from portfolio assessments often do not show anticipated relationships with other achievement data, and teachers report practices in the implementation of portfolio assessment that are appropriate for instructional purposes but threaten the validity of inferences from portfolio scores. While other studies show positive effects of portfolio programmes (see Stecher, this issue), these findings suggest that portfolio assessment in its current state of development is problematic for many of the uses to which large-scale external assessments are now put in the US.
