Univariate Two-Population Distribution-free Discrimination

1 December 1954

journal article
research article
Published by JSTOR in Journal of the American Statistical Association

Vol. 49 (268), 770
https://doi.org/10.2307/2281538

Abstract

A distribution-free procedure for classifying a univariate random variable, z, into one of two populations on the basis of a sample of size N, in which m members are classified into one population and the remaining (N – m) into the other, is given as follows: Let t(z) = k(z) – h(z), where k(z) is the number of observations from the first population which are less than z and h(z) is similarly defined for the second population. If z ≦ ζ*, where ζ* is that value of z for which t(z) is a maximum, classify z into the first population, otherwise into the second. The probability of correct classification, and its estimate, [N – m + t(ζ*)]/N, both converge in probability to the maximum attainable probability of correct classification.
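
The rule in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the paper's own implementation. The function names `fit_threshold` and `classify` are hypothetical. Two assumptions go beyond the abstract: the sample has no ties, and t(z) is evaluated only at midpoints between adjacent pooled observations (plus ±∞). The second is enough because t(z) is a step function that changes only at sample points.

```python
import numpy as np

def fit_threshold(first, second):
    """Return (zeta_star, t_max, estimate) for the rule 'z <= zeta_star -> population 1'.

    Assumes no ties in the pooled sample (hypothetical helper, not from the paper).
    """
    first = np.sort(np.asarray(first, dtype=float))
    second = np.sort(np.asarray(second, dtype=float))
    pooled = np.sort(np.concatenate([first, second]))

    # Candidate cut points: midpoints between adjacent pooled observations,
    # plus -inf and +inf so that "classify everything into one population" is covered.
    mids = (pooled[:-1] + pooled[1:]) / 2.0
    candidates = np.concatenate([[-np.inf], mids, [np.inf]])

    # side="left" gives the number of observations strictly less than each candidate.
    k = np.searchsorted(first, candidates, side="left")   # k(z): first population
    h = np.searchsorted(second, candidates, side="left")  # h(z): second population
    t = k - h

    best = int(np.argmax(t))           # first maximiser of t(z)
    n_total = first.size + second.size # N
    m = first.size                     # m
    estimate = (n_total - m + t[best]) / n_total  # [N - m + t(zeta*)] / N
    return float(candidates[best]), int(t[best]), float(estimate)

def classify(z, zeta_star):
    """Assign z to population 1 if z <= zeta_star, otherwise to population 2."""
    return 1 if z <= zeta_star else 2

if __name__ == "__main__":
    zeta, t_max, est = fit_threshold([1.0, 2.0, 5.0], [3.0, 4.0, 6.0])
    print(zeta, t_max, est)
    print(classify(2.2, zeta), classify(4.7, zeta))
```

With the toy samples above, the cut falls between 2 and 3. Two of the three first-population points lie below it and all three second-population points lie above it, so the estimated probability of correct classification is 5/6. The rule as stated favours the first population for small z. If the first population tends to take larger values, the population labels would need to be swapped before fitting.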
