A Fast Learning Algorithm for Deep Belief Nets

1 July 2006

journal article
Published by MIT Press in Neural Computation

Vol. 18 (7), 1527-1554
https://doi.org/10.1162/neco.2006.18.7.1527

Abstract

We show how to use "complementary priors" to eliminate the explaining-away effects that make inference difficult in densely connected belief nets that have many hidden layers. Using complementary priors, we derive a fast, greedy algorithm that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory. The fast, greedy algorithm is used to initialize a slower learning procedure that fine-tunes the weights using a contrastive version of the wake-sleep algorithm. After fine-tuning, a network with three hidden layers forms a very good generative model of the joint distribution of handwritten digit images and their labels. This generative model gives better digit classification than the best discriminative learning algorithms. The low-dimensional manifolds on which the digits lie are modeled by long ravines in the free-energy landscape of the top-level associative memory, and it is easy to explore these ravines by using the directed connections to display what the associative memory has in mind.
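
The greedy, layer-at-a-time idea described in the abstract can be illustrated with a stack of restricted Boltzmann machines, each trained by one-step contrastive divergence. The following is a minimal sketch, not the authors' implementation: the names RBM, cd1_update and train_greedy_stack, the toy data, the layer sizes and the learning rate are all assumptions made for illustration. It passes hidden probabilities rather than sampled binary states to the next layer (a common simplification), and it omits both the label units of the top-level associative memory and the contrastive wake-sleep fine-tuning stage.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class RBM:
    """Binary restricted Boltzmann machine (hypothetical helper class)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        # Small random weights, zero biases.
        self.W = [[rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_vis = [0.0] * n_visible
        self.b_hid = [0.0] * n_hidden

    def hidden_probs(self, v):
        return [sigmoid(self.b_hid[j] + sum(v[i] * self.W[i][j] for i in range(len(v))))
                for j in range(len(self.b_hid))]

    def visible_probs(self, h):
        return [sigmoid(self.b_vis[i] + sum(h[j] * self.W[i][j] for j in range(len(h))))
                for i in range(len(self.b_vis))]

    def sample(self, probs):
        return [1.0 if self.rng.random() < p else 0.0 for p in probs]

    def cd1_update(self, v0, lr):
        # Positive phase: hidden probabilities driven by the data vector.
        h0 = self.hidden_probs(v0)
        # Negative phase: one Gibbs step yields a reconstruction.
        # Assumption: the reconstruction uses probabilities, not samples.
        v1 = self.visible_probs(self.sample(h0))
        h1 = self.hidden_probs(v1)
        # Move weights toward data statistics, away from reconstruction statistics.
        for i in range(len(v0)):
            for j in range(len(h0)):
                self.W[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
        for i in range(len(v0)):
            self.b_vis[i] += lr * (v0[i] - v1[i])
        for j in range(len(h0)):
            self.b_hid[j] += lr * (h0[j] - h1[j])


def train_greedy_stack(data, layer_sizes, epochs=50, lr=0.1, seed=0):
    """Train one RBM per entry in layer_sizes, bottom layer first."""
    rng = random.Random(seed)
    stack, layer_input = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(len(layer_input[0]), n_hidden, rng)
        for _ in range(epochs):
            for v in layer_input:
                rbm.cd1_update(v, lr)
        stack.append(rbm)
        # Freeze this layer; its hidden activations are the next layer's data.
        layer_input = [rbm.hidden_probs(v) for v in layer_input]
    return stack


# Toy binary "images" with two obvious patterns (illustrative data only).
toy_data = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]] * 4
stack = train_greedy_stack(toy_data, layer_sizes=[3, 2])
```

Each layer is trained only after the one below it has been frozen, so the cost of learning grows with the number of layers rather than with the difficulty of joint inference over all of them. In the paper's setting the stack would then be fine-tuned as a whole; that step is not shown here.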

Cited by 11639 articles