Learning to Represent Spatial Transformations with Factored Higher-Order Boltzmann Machines
- 1 June 2010
- journal article
- Published by MIT Press in Neural Computation
- Vol. 22 (6), 1473-1492
- https://doi.org/10.1162/neco.2010.01-09-953
Abstract
To allow the hidden units of a restricted Boltzmann machine to model the transformation between two successive images, Memisevic and Hinton (2007) introduced three-way multiplicative interactions that use the intensity of a pixel in the first image as a multiplicative gain on a learned, symmetric weight between a pixel in the second image and a hidden unit. This creates cubically many parameters, which form a three-dimensional interaction tensor. We describe a low-rank approximation to this interaction tensor that uses a sum of factors, each of which is a three-way outer product. This approximation allows efficient learning of transformations between larger image patches. Since each factor can be viewed as an image filter, the model as a whole learns optimal filter pairs for efficiently representing transformations. We demonstrate the learning of optimal filter pairs from various synthetic and real image sequences. We also show how learning about image transformations allows the model to perform a simple visual analogy task, and we show how a completely unsupervised network trained on transformations perceives multiple motions of transparent dot patterns in the same way as humans.
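The low-rank factorization described in the abstract can be illustrated with a minimal sketch. It assumes the interaction tensor is written as w[i][j][k] = sum over f of wx[i][f] * wy[j][f] * wh[k][f], where i indexes pixels of the first image, j pixels of the second image, k hidden units, and f factors. All function and variable names here are hypothetical, biases and learning are omitted, and the code is an illustration of the idea rather than the authors' implementation.

```python
import random


def make_factors(n_in, n_out, n_hid, n_fac, seed=0):
    """Draw three small random factor matrices (illustrative values only)."""
    rng = random.Random(seed)

    def mat(rows):
        return [[rng.gauss(0.0, 0.1) for _ in range(n_fac)] for _ in range(rows)]

    return mat(n_in), mat(n_out), mat(n_hid)


def full_tensor(wx, wy, wh):
    """Expand the factors: w[i][j][k] = sum_f wx[i][f] * wy[j][f] * wh[k][f]."""
    n_fac = len(wx[0])
    return [
        [
            [sum(wx_i[f] * wy_j[f] * wh_k[f] for f in range(n_fac)) for wh_k in wh]
            for wy_j in wy
        ]
        for wx_i in wx
    ]


def hidden_input_full(w, x, y):
    """Total input to each hidden unit using the explicit three-way tensor."""
    n_hid = len(w[0][0])
    return [
        sum(w[i][j][k] * x[i] * y[j] for i in range(len(x)) for j in range(len(y)))
        for k in range(n_hid)
    ]


def hidden_input_factored(wx, wy, wh, x, y):
    """Same quantity via the factors, without ever building the tensor.

    Each factor applies one filter to the first image and one to the second;
    the two filter responses are multiplied and passed on to the hidden units.
    """
    n_fac = len(wx[0])
    fx = [sum(wx[i][f] * x[i] for i in range(len(x))) for f in range(n_fac)]
    fy = [sum(wy[j][f] * y[j] for j in range(len(y))) for f in range(n_fac)]
    return [sum(wh_k[f] * fx[f] * fy[f] for f in range(n_fac)) for wh_k in wh]


def param_counts(n_in, n_out, n_hid, n_fac):
    """Return (full tensor parameters, factored parameters)."""
    return n_in * n_out * n_hid, n_fac * (n_in + n_out + n_hid)
```

Under these assumptions the two ways of computing the hidden-unit input agree up to floating-point error, while the factored form stores a number of parameters that grows linearly with patch size for a fixed number of factors, instead of cubically.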
This publication has 12 references indexed in Scilit:
- Learning the Lie Groups of Visual Invariance. Neural Computation, 2007
- Unsupervised Learning of Image Transformations. Published by Institute of Electrical and Electronics Engineers (IEEE), 2007
- Bilinear models of natural images. Published by SPIE-Intl Soc Optical Eng, 2007
- Bilinear Sparse Coding for Invariant Vision. Neural Computation, 2005
- An efficient algorithm for learning invariance in adaptive classifiers. Published by Institute of Electrical and Electronics Engineers (IEEE), 2003
- Training Products of Experts by Minimizing Contrastive Divergence. Neural Computation, 2002
- Separating Style and Content with Bilinear Models. Neural Computation, 2000
- The “independent components” of natural scenes are edge filters. Vision Research, 1997
- Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature, 1996
- Learning Invariance from Transformation Sequences. Neural Computation, 1991