A model for size- and rotation-invariant pattern processing in the visual system

Abstract
The mapping of retinal space onto the striate cortex of some mammals can be approximated by a log-polar function. It has been proposed that this mapping is of functional importance for scale-and rotation-invariant pattern recognition in the visual system. An exact log-polar transform converts centered scaling and rotation into translations. A subsequent translation-invariant transform, such as the absolute value of the Fourier transform, thus generates overall size-and rotation-invariance. In our model, the translation-invariance is realized via the R-transform. This transform can be executed by simple neural networks, and it does not require the complex computations of the Fourier transform, used in Mellin-transform size-invariance models. The logarithmic space distortion and differentiation in the first processing stage of the model is realized via “Mexican hat” filters whose diameter increases linearly with eccentricity, similar to the characteristics of the receptive fields of retinal ganglion cells. Except for some special cases, the model can explain object recognition independent of size, orientation and position. Some general problems of Mellin-type size-invariance models-that also apply to our model-are discussed.

This publication has 36 references indexed in Scilit: