Feature Selection for Automatic Classification of Non-Gaussian Data

Abstract
A computer-based technique for automatic selection of features for the classification of non-Gaussian data is presented. The selection technique exploits interactive cluster finding and a modified branch and bound optimization of piecewise linear classifiers. The technique first finds an efficient set of pairs of oppositely classified clusters to represent the data. Then a zero-one implicit enumeration implements a branch and bound search for a good subset of features. A test of the feature selection technique on multidimensional synthetic and real data yielded close-to-optimum, and in many cases optimum, subsets of features. The real data consisted of a) 1284 12-dimensional feature vectors representing normal and abnormal breast tissue, extracted from X-ray mammograms, and b) 1060 30-dimensional feature vectors representing tanks and clutter in infrared video images.
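
As a rough illustration of the zero-one branch and bound idea mentioned above, the following minimal sketch searches for a fixed-size feature subset. It is not the authors' implementation: the function name `branch_and_bound_select`, the parameter names, and the toy additive criterion are all hypothetical, and the sketch assumes a monotone criterion (removing a feature never raises the score) in place of the piecewise linear classifier performance used in the paper.

```python
from typing import Callable, FrozenSet, Tuple


def branch_and_bound_select(
    n_features: int,
    n_keep: int,
    criterion: Callable[[FrozenSet[int]], float],
) -> Tuple[FrozenSet[int], float]:
    """Return the size-n_keep subset maximizing a monotone criterion.

    Assumption (hypothetical, not from the paper): criterion(S) never
    increases when a feature is removed from S. This is what makes the
    pruning step below safe.
    """
    best_score = float("-inf")
    best_subset: FrozenSet[int] = frozenset()

    def search(current: FrozenSet[int], next_removable: int) -> None:
        nonlocal best_score, best_subset
        score = criterion(current)
        # Bound: every subset of `current` scores at most `score`,
        # so this branch cannot beat the incumbent.
        if score <= best_score:
            return
        if len(current) == n_keep:
            best_score, best_subset = score, current
            return
        # Branch: flip one more feature from 1 (kept) to 0 (dropped).
        # Dropping in increasing index order visits each subset once.
        for f in range(next_removable, n_features):
            search(current - {f}, f + 1)

    search(frozenset(range(n_features)), 0)
    return best_subset, best_score


# Toy usage: an additive criterion with made-up integer feature weights.
weights = [9, 1, 5, 7, 2]
subset, score = branch_and_bound_select(5, 2, lambda s: sum(weights[i] for i in s))
print(sorted(subset), score)  # prints: [0, 3] 16
```

With a monotone criterion the pruning discards only branches that cannot contain a better subset, so the search result matches exhaustive enumeration while usually evaluating fewer subsets. With a non-monotone criterion the same code would act only as a heuristic.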
