Classification via Minimum Incremental Coding Length

1 January 2009

journal article
research article
Published by Society for Industrial & Applied Mathematics (SIAM) in SIAM Journal on Imaging Sciences

Vol. 2 (2), 367-395
https://doi.org/10.1137/070707312

Abstract

We present a simple new criterion for classification, based on principles from lossy data compression. The criterion assigns a test sample to the class that requires the minimum number of additional bits to code the test sample, subject to an allowable distortion. We demonstrate the asymptotic optimality of this criterion for Gaussian distributions and analyze its relationships to classical classifiers. The theoretical results clarify the connections between our approach and popular classifiers such as maximum a posteriori (MAP), regularized discriminant analysis (RDA), k-nearest neighbor (k-NN), and support vector machine (SVM), as well as unsupervised methods based on lossy coding. Our formulation has several desirable effects on the resulting classifier. First, minimizing the lossy coding length induces a regularization effect that stabilizes the (implicit) density estimate in a small-sample setting. Second, compression provides a uniform means of handling classes of varying dimension. The new criterion and its kernel and local versions perform competitively on synthetic examples, as well as on real imagery data such as handwritten digits and face images. On these problems, the performance of our simple classifier approaches the best reported results, without using domain-specific information. All MATLAB code and classification results are publicly available for peer evaluation at http://perception.csl.uiuc.edu/coding/home.htm.
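
The decision rule described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' MATLAB code. It assumes a Gaussian lossy coding length of the form commonly used in the lossy-coding literature, (m + n)/2 · log2 det(I + (n/ε²) Σ) + (n/2) · log2(1 + μᵀμ/ε²), for m samples of dimension n with empirical mean μ and covariance Σ, and it charges −log2 of the empirical class frequency for coding the label. The function names `coding_length` and `micl_classify` and the parameter `eps` are hypothetical, chosen only for this illustration.

```python
import numpy as np


def coding_length(X, eps):
    """Approximate number of bits needed to code the rows of X up to
    distortion eps**2, under a Gaussian model (assumed formula)."""
    m, n = X.shape
    mu = X.mean(axis=0)
    Xc = X - mu
    sigma = Xc.T @ Xc / m  # empirical covariance (1/m normalization)
    # slogdet is numerically safer than log(det(...)); convert nats to bits.
    _, logdet = np.linalg.slogdet(np.eye(n) + (n / eps**2) * sigma)
    bits_cov = 0.5 * (m + n) * logdet / np.log(2)
    bits_mean = 0.5 * n * np.log2(1.0 + (mu @ mu) / eps**2)
    return bits_cov + bits_mean


def micl_classify(x, classes, eps):
    """Return the label whose training set needs the fewest additional
    bits to also code x. `classes` maps label -> (m_j, n) sample array."""
    total = sum(len(Xj) for Xj in classes.values())
    best_label, best_bits = None, np.inf
    for label, Xj in classes.items():
        # Extra bits to code x together with the class-j training samples.
        extra = coding_length(np.vstack([Xj, x]), eps) - coding_length(Xj, eps)
        # Bits to code the class label itself (assumed empirical prior).
        extra += -np.log2(len(Xj) / total)
        if extra < best_bits:
            best_label, best_bits = label, extra
    return best_label
```

Calling `micl_classify(x, {"a": Xa, "b": Xb}, eps=1.0)` with two training arrays returns the label with the smaller incremental coding length. The distortion `eps` acts as a regularizer in this sketch: the identity term keeps the determinant well defined even when a class has fewer samples than dimensions.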
