Using mutual information for selecting features in supervised neural net learning

1 July 1994

journal article
Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Neural Networks

Vol. 5 (4), 537-550
https://doi.org/10.1109/72.298224

Abstract

This paper investigates the application of the mutual information criterion to evaluate a set of candidate features and to select an informative subset to be used as input data for a neural network classifier. Because the mutual information measures arbitrary dependencies between random variables, it is suitable for assessing the "information content" of features in complex classification tasks, where methods based on linear relations (like the correlation) are prone to mistakes. The fact that the mutual information is independent of the coordinates chosen permits a robust estimation. Nonetheless, the use of the mutual information for tasks characterized by high input dimensionality requires suitable approximations because of the prohibitive demands on computation and samples. An algorithm is proposed that is based on a "greedy" selection of the features and that takes into account both the mutual information with respect to the output class and the mutual information with respect to the already-selected features. Finally, the results of a series of experiments are discussed.
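
The greedy selection described in the abstract can be illustrated with a short sketch. The abstract does not state the scoring rule, so everything specific here is an assumption made for illustration, not the paper's implementation: features are taken to be discrete, mutual information is estimated with a simple plug-in (histogram) estimate, and each candidate is scored as its mutual information with the class minus a weight `beta` times the sum of its mutual information with the features already chosen. The names `mutual_information`, `greedy_mi_selection`, and `beta` are hypothetical.

```python
import numpy as np


def mutual_information(x, y):
    """Plug-in estimate of I(X;Y) in nats for two discrete 1-D arrays."""
    xi = np.unique(np.asarray(x), return_inverse=True)[1].ravel()
    yi = np.unique(np.asarray(y), return_inverse=True)[1].ravel()
    # Joint histogram of the two label sequences, normalised to probabilities.
    joint = np.zeros((xi.max() + 1, yi.max() + 1))
    np.add.at(joint, (xi, yi), 1.0)
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)   # marginal of X, shape (nx, 1)
    py = joint.sum(axis=0, keepdims=True)   # marginal of Y, shape (1, ny)
    nz = joint > 0                          # skip empty cells (0 * log 0 = 0)
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))


def greedy_mi_selection(X, y, k, beta=0.5):
    """Pick up to k feature indices greedily (assumed scoring rule).

    score(f) = I(f; class) - beta * sum of I(f; s) over selected features s
    """
    X = np.asarray(X)
    n_features = X.shape[1]
    relevance = [mutual_information(X[:, j], y) for j in range(n_features)]
    redundancy = np.zeros(n_features)       # running sum of I(f; s)
    selected, remaining = [], list(range(n_features))
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda j: relevance[j] - beta * redundancy[j])
        selected.append(best)
        remaining.remove(best)
        # Penalise the remaining candidates for overlapping with the new pick.
        for j in remaining:
            redundancy[j] += mutual_information(X[:, j], X[:, best])
    return selected


# Toy data: the class depends on two independent binary variables a and b.
a = np.tile([0, 0, 1, 1], 50)
b = np.tile([0, 1, 0, 1], 50)
noise = np.random.default_rng(0).integers(0, 2, size=a.size)
X = np.column_stack([a, a, b, noise])       # column 1 duplicates column 0
y = 2 * a + b
selected = greedy_mi_selection(X, y, k=2, beta=0.5)
```

On this toy data the redundancy penalty is what keeps the selection from taking both copies of `a`: columns 0 and 1 are equally informative about the class, but once one of them is chosen the other shares all of its information with it, so column 2 scores higher. Setting `beta=0` reduces the procedure to ranking features by class relevance alone.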

Cited by 1876 articles