Abstract
The primary goal of any adaptive system that learns by example is to generalize from the training examples to novel inputs. The backpropagation learning algorithm is popular for its simplicity and for landmark demonstrations of generalization. It has been observed that backpropagation networks sometimes generalize better when they contain a hidden layer with considerably fewer units than the preceding layers. The functional properties of such hidden-layer bottlenecks are analyzed, and a method for dynamically creating them, concurrent with backpropagation learning, is described. The method does not excise hidden units; rather, it compresses the dimensionality of the space spanned by the hidden-unit weight vectors and forms clusters of weight vectors in the low-dimensional space. The result is a functional bottleneck distributed across many units. The method is a gradient descent procedure, using local computations on simple lateral Hebbian connections between hidden units.
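To make the idea concrete, here is a minimal, illustrative sketch in Python rather than the paper's exact procedure: after each ordinary backpropagation step, every hidden unit's incoming weight vector is nudged toward the weight vectors of similar units, so the vectors drift into clusters and collectively span a lower-dimensional subspace. The function names (`bottleneck_step`), the use of cosine similarity as a stand-in for lateral Hebbian connection strengths, and the specific penalty being minimized are all assumptions made for this example, not the published method.

```python
import numpy as np

def bottleneck_step(W, rate=0.05):
    """Pull hidden-unit weight vectors toward similar units (hypothetical sketch).

    W    : (n_hidden, n_inputs) matrix of incoming weight vectors.
    rate : step size for the lateral clustering update.
    """
    # Pairwise cosine similarity between weight vectors stands in for the
    # lateral Hebbian connection strengths; only positive similarities attract.
    norms = np.linalg.norm(W, axis=1, keepdims=True) + 1e-12
    U = W / norms
    S = np.clip(U @ U.T, 0.0, None)   # (n_hidden, n_hidden) attraction weights
    np.fill_diagonal(S, 0.0)

    # One gradient-descent step on 0.5 * sum_ij S_ij * ||w_i - w_j||^2,
    # treating S as constant: each unit moves toward units it resembles.
    attraction = S @ W - S.sum(axis=1, keepdims=True) * W
    return W + rate * attraction

# Usage: interleave the lateral update with ordinary backpropagation updates.
rng = np.random.default_rng(0)
W_hidden = rng.normal(size=(8, 20))   # 8 hidden units, 20 inputs
for _ in range(100):
    # ... ordinary backprop update of W_hidden would go here ...
    W_hidden = bottleneck_step(W_hidden)

# The singular-value spectrum of W_hidden contracts toward a few dominant
# directions, i.e. a distributed bottleneck rather than removed units.
print("singular values:", np.round(np.linalg.svd(W_hidden, compute_uv=False), 3))
```

Note that no unit is deleted in this sketch; the compression shows up only in the effective rank of the weight matrix, mirroring the distributed bottleneck described above.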
