Nonparametric Bayes error estimation using unclassified samples
- 1 July 1973
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Information Theory
- Vol. 19 (4), 434-440
- https://doi.org/10.1109/tit.1973.1055049
Abstract
A new nonparametric method of estimating the Bayes risk using an unclassified test sample set as well as a classified design sample set is introduced. The classified design set is used to obtain nonparametric estimates of the conditional Bayes risk of classification at each point of the unclassified test set. The average of these risk estimates is the error estimate. For large numbers of design samples the new error estimate has less variance than does an error-count estimate for classified test samples using the optimum Bayes classifier. The first application of the nonparametric method uses k-nearest neighbor (k-NN) estimates of the posterior probabilities to form the risk estimate. A large-sample analysis is made of this estimate. The expected value of this estimate is shown to be a lower bound on the Bayes error. A simple modification provides unbiased estimates of the k-NN classification error, thus providing an upper bound on the Bayes error. The second application of the method uses Parzen approximation of the density functions to obtain estimates of the risk and subsequently the Bayes error. Results of experiments on simulated data illustrate the small-sample behavior.
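The k-NN variant described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `knn_risk_estimate`, the Euclidean distance, the arbitrary tie handling, and the simulated Gaussian data are all assumptions made for illustration. At each unclassified test point, the class posteriors are estimated as the fractions of the k nearest design samples carrying each label, the conditional risk is taken as one minus the largest estimated posterior, and the risks are averaged over the test set.

```python
import math
import random


def knn_risk_estimate(design_x, design_y, test_x, k):
    """Average k-NN estimate of the conditional Bayes risk over unlabeled points.

    design_x: list of feature tuples (classified design set)
    design_y: list of class labels, one per design sample
    test_x:   list of feature tuples (unclassified test set)
    k:        number of neighbors used for the posterior estimate
    (Hypothetical helper for illustration; names are not from the paper.)
    """
    if not 1 <= k <= len(design_x):
        raise ValueError("k must be between 1 and the number of design samples")
    if not test_x:
        raise ValueError("test_x must not be empty")

    risks = []
    for x in test_x:
        # Indices of the k design samples closest to x (Euclidean distance).
        nearest = sorted(range(len(design_x)),
                         key=lambda i: math.dist(x, design_x[i]))[:k]
        # Posterior estimate per class: fraction of the k neighbors with that label.
        counts = {}
        for i in nearest:
            counts[design_y[i]] = counts.get(design_y[i], 0) + 1
        # Conditional risk estimate: one minus the largest estimated posterior.
        risks.append(1.0 - max(counts.values()) / k)

    # The error estimate is the average of the pointwise risk estimates.
    return sum(risks) / len(risks)


if __name__ == "__main__":
    # Simulated two-class data (an assumption): unit-variance Gaussians at 0 and 2.
    rng = random.Random(0)
    design_x, design_y = [], []
    for label, mean in ((0, 0.0), (1, 2.0)):
        for _ in range(100):
            design_x.append((rng.gauss(mean, 1.0),))
            design_y.append(label)
    # Test points are drawn from the same mixture, but their labels are never used.
    test_x = [(rng.gauss(rng.choice((0.0, 2.0)), 1.0),) for _ in range(200)]
    print(knn_risk_estimate(design_x, design_y, test_x, k=5))
```

Note that the labels of the test points never enter the computation, which is what allows an unclassified test set to be used. The modification mentioned in the abstract that yields an unbiased estimate of the k-NN classification error is not reproduced here, since the abstract does not give its form.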
This publication has 12 references indexed in Scilit:
- Application of optimum error-reject functions (Corresp.). IEEE Transactions on Information Theory, 1972
- Estimation of Classification Error. IEEE Transactions on Computers, 1971
- Convergence of the nearest neighbor rule. IEEE Transactions on Information Theory, 1971
- Nonparametric Bayes-risk estimation. IEEE Transactions on Information Theory, 1971
- The Nearest Neighbor Classification Rule with a Reject Option. IEEE Transactions on Systems Science and Cybernetics, 1970
- On optimum recognition error and reject tradeoff. IEEE Transactions on Information Theory, 1970
- Some convergence properties of a nearest neighbor decision rule. IEEE Transactions on Information Theory, 1970
- Nearest neighbor pattern classification. IEEE Transactions on Information Theory, 1967
- Estimation of a multivariate density. Annals of the Institute of Statistical Mathematics, 1966
- On Estimation of a Probability Density Function and Mode. The Annals of Mathematical Statistics, 1962