Direct and Indirect Cooperation between Temporal and Parietal Networks for Invariant Visual Recognition

1 January 1992

journal article
Published by MIT Press in Journal of Cognitive Neuroscience

Vol. 4 (1), 35-57
https://doi.org/10.1162/jocn.1992.4.1.35

Abstract

A new type of biologically inspired multilayered network is proposed to model the properties of the primate visual system with respect to invariant visual recognition (IVR). This model is based on 10 major neurobiological and psychological constraints. The first five constraints shape the architecture and properties of the network. 1. The network model has a Y-like double-branched multilayered architecture, with one input (the retina) and two parallel outputs, the "What" and the "Where," which model, respectively, the temporal pathway, specialized for "object" identification, and the parietal pathway, specialized for "spatial" localization. 2. Four processing layers are sufficient to model the main functional steps of the primate visual system that transform the retinal information into prototypes (object-centered reference frame) in the "What" branch and into an oculomotor command in the "Where" branch. 3. The distribution of receptive field sizes within and between the two functional pathways provides an appropriate tradeoff between discrimination and invariant recognition capabilities. 4. The two outputs are represented by population coding: the ocular command is computed as a population vector in the "Where" branch and the prototypes are coded in a "semidistributed" way in the "What" branch. In the intermediate associative steps, processing units learn to associate prototypes (through feedback connections) with component features (through feedforward ones). 5. The basic processing units of the network do not model single cells but model the local neuronal circuits that combine different information flows organized in separate cortical layers. Such a biologically constrained model shows shift-invariant and size-invariant capabilities that resemble those of humans (psychological constraints): 6. During the Learning session, a set of patterns (26 capital letters and 2 geometric figures) is presented to the network: a single presentation of each pattern in one position (at the center) and with one size is sufficient to learn the corresponding prototypes (internal representations). These patterns are then presented in widely varying new sizes and positions during the Recognition session: 7. The "What" branch of the network succeeds in immediate recognition for patterns presented in the central zone of the retina with the learned size. 8. The recognition by the "What" branch is resistant to changes in size within a limited range of variation related to the distribution of receptive field (RF) sizes in the successive processing steps of this pathway. 9. Even when ocular movements are not allowed, the recognition capabilities of the "What" branch are unaffected by changing positions around the learned one. This significant shift-invariance of the "What" branch is also related to the distribution of RF sizes. 10. When both sizes and locations vary, the "What" and the "Where" branches cooperate for recognition: the location coding in the "Where" branch can command, under the control of the "What" branch, an ocular movement that brings peripheral patterns back toward the central zone of the retina until recognition succeeds. This model results in predictions about anatomical connections and physiological interactions between temporal and parietal cortices.
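The cooperation described in constraints 4 and 10 can be illustrated with a minimal sketch. The abstract gives no equations, so everything below is an assumption made for illustration: the grid of "Where" units, their Gaussian tuning, the normalized (activity-weighted average) form of the population vector, and the `recognized` stand-in for the "What" branch, which simply succeeds when the pattern lies inside a central zone. None of these names or parameters come from the paper.

```python
import math

def make_units(half_width=4):
    """Preferred retinal positions of hypothetical 'Where' units on a square grid."""
    return [(x, y) for x in range(-half_width, half_width + 1)
                   for y in range(-half_width, half_width + 1)]

def activities(units, target, sigma=1.5):
    """Assumed Gaussian tuning: a unit fires most when the pattern is at its preferred position."""
    tx, ty = target
    return [math.exp(-((ux - tx) ** 2 + (uy - ty) ** 2) / (2 * sigma ** 2))
            for ux, uy in units]

def population_vector(units, acts):
    """Activity-weighted average of the preferred positions (normalized population vector)."""
    total = sum(acts)
    if total == 0:
        return (0.0, 0.0)
    vx = sum(a * ux for a, (ux, _) in zip(acts, units)) / total
    vy = sum(a * uy for a, (_, uy) in zip(acts, units)) / total
    return (vx, vy)

def recognized(target, central_radius=1.0):
    """Stand-in for the 'What' branch: succeeds only inside the assumed central zone."""
    return math.hypot(target[0], target[1]) <= central_radius

def recenter(target, units, max_saccades=5):
    """Shift gaze by the decoded vector until the 'What' stand-in reports success."""
    saccades = 0
    while not recognized(target) and saccades < max_saccades:
        vx, vy = population_vector(units, activities(units, target))
        # Moving the eye by (vx, vy) shifts the pattern by (-vx, -vy) on the retina.
        target = (target[0] - vx, target[1] - vy)
        saccades += 1
    return target, saccades

units = make_units()
final_position, n_saccades = recenter((3.0, -2.0), units)
```

In this sketch the decoded vector is biased toward the center for patterns near the edge of the grid, because part of the Gaussian activity profile falls outside the modeled retina. The loop therefore allows several corrective movements and does not assume that a single one suffices.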

This publication has 63 references indexed in Scilit.

Cited by 18 articles