Learning to Represent Spatial Transformations with Factored Higher-Order Boltzmann Machines

1 June 2010

journal article
Published by MIT Press in Neural Computation

Vol. 22 (6), 1473-1492
https://doi.org/10.1162/neco.2010.01-09-953

Abstract

To allow the hidden units of a restricted Boltzmann machine to model the transformation between two successive images, Memisevic and Hinton (2007) introduced three-way multiplicative interactions that use the intensity of a pixel in the first image as a multiplicative gain on a learned, symmetric weight between a pixel in the second image and a hidden unit. This creates cubically many parameters, which form a three-dimensional interaction tensor. We describe a low-rank approximation to this interaction tensor that uses a sum of factors, each of which is a three-way outer product. This approximation allows efficient learning of transformations between larger image patches. Since each factor can be viewed as an image filter, the model as a whole learns optimal filter pairs for efficiently representing transformations. We demonstrate the learning of optimal filter pairs from various synthetic and real image sequences. We also show how learning about image transformations allows the model to perform a simple visual analogy task, and we show how a completely unsupervised network trained on transformations perceives multiple motions of transparent dot patterns in the same way as humans.
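
The factorization described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the sizes, the random filters, and the names `wx`, `wy`, `wh` are assumptions chosen for illustration, and biases and learning are omitted. It only shows that a sum of three-way outer products gives the same hidden-unit input as the full interaction tensor while using far fewer parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
n_in, n_out, n_hid, n_fac = 6, 6, 4, 3

# One filter per factor for each of the three groups of units.
wx = rng.normal(size=(n_in, n_fac))   # filters applied to the first image
wy = rng.normal(size=(n_out, n_fac))  # filters applied to the second image
wh = rng.normal(size=(n_hid, n_fac))  # weights from factors to hidden units

# Full interaction tensor as a sum of three-way outer products:
# W[i, j, k] = sum_f wx[i, f] * wy[j, f] * wh[k, f]
W = np.einsum("if,jf,kf->ijk", wx, wy, wh)

x = rng.normal(size=n_in)   # stand-in for the first image patch
y = rng.normal(size=n_out)  # stand-in for the second image patch

# Hidden-unit input computed through the full tensor (cubic cost).
h_full = np.einsum("ijk,i,j->k", W, x, y)

# Same quantity through the factors: filter each image, multiply the
# two filter responses per factor, then map the products to hidden units.
h_fact = wh @ ((wx.T @ x) * (wy.T @ y))

# Parameter counts for the full tensor versus the factored form.
n_params_full = n_in * n_out * n_hid
n_params_fact = n_fac * (n_in + n_out + n_hid)
```

In this sketch the columns of `wx` and `wy` play the role of the filter pairs mentioned in the abstract: each factor responds to the product of one filter response on each image.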

This publication has 12 references indexed in Scilit:

Learning the Lie Groups of Visual Invariance
Neural Computation, 2007
Unsupervised Learning of Image Transformations
Published by Institute of Electrical and Electronics Engineers (IEEE), 2007
Bilinear models of natural images
Published by SPIE-Intl Soc Optical Eng, 2007
Bilinear Sparse Coding for Invariant Vision
Neural Computation, 2005
An efficient algorithm for learning invariance in adaptive classifiers
Published by Institute of Electrical and Electronics Engineers (IEEE), 2003
Training Products of Experts by Minimizing Contrastive Divergence
Neural Computation, 2002
Separating Style and Content with Bilinear Models
Neural Computation, 2000
The “independent components” of natural scenes are edge filters
Vision Research, 1997
Emergence of simple-cell receptive field properties by learning a sparse code for natural images
Nature, 1996
Learning Invariance from Transformation Sequences
Neural Computation, 1991

Cited by 114 articles