Learning Visual Object Categories for Robot Affordance Prediction

23 December 2009

journal article
research article
Published by SAGE Publications in The International Journal of Robotics Research

Vol. 29 (2-3), 174-197
https://doi.org/10.1177/0278364909356602

Abstract

A fundamental requirement of any autonomous robot system is the ability to predict the affordances of its environment. The set of affordances defines the actions that are available to the agent given the robot’s context. A standard approach to affordance learning is direct perception, which learns direct mappings from sensor measurements to affordance labels. For example, a robot designed for cross-country navigation could map stereo depth information and image features directly into predictions about the traversability of terrain regions. While this approach can succeed for a small number of affordances, it does not scale well as the number of affordances increases. In this paper, we show that visual object categories can be used as an intermediate representation that makes the affordance learning problem scalable. We develop a probabilistic graphical model, which we call the Category-Affordance (CA) model, that describes the relationships between object categories, affordances, and appearance. This model casts visual object categorization as an intermediate inference step in affordance prediction. We describe several novel affordance learning and training strategies that are supported by our new model. Experimental results with indoor mobile robots evaluate these different strategies and demonstrate the advantages of the CA model in affordance learning, especially when learning from limited-size data sets.
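
The idea of treating categorization as an intermediate inference step can be illustrated with a minimal sketch. It is not the CA model from the paper, whose structure the abstract does not specify. It assumes, purely for illustration, that affordances are conditionally independent of appearance given the object category, so that P(a | x) = Σ_c P(a | c) P(c | x). The function name, category names, affordance names, and all probabilities are hypothetical.

```python
from typing import Dict


def predict_affordances(
    p_category_given_appearance: Dict[str, float],
    p_affordance_given_category: Dict[str, Dict[str, float]],
) -> Dict[str, float]:
    """Marginalize over categories: P(a | x) = sum_c P(a | c) * P(c | x).

    Assumption (not from the paper): affordances depend on appearance
    only through the object category.
    """
    result: Dict[str, float] = {}
    for category, p_c in p_category_given_appearance.items():
        for affordance, p_a in p_affordance_given_category[category].items():
            result[affordance] = result.get(affordance, 0.0) + p_a * p_c
    return result


# Hypothetical categorizer output for one image region.
p_c_given_x = {"chair": 0.7, "box": 0.3}

# Hypothetical per-category affordance probabilities.
p_a_given_c = {
    "chair": {"pushable": 0.9, "liftable": 0.2},
    "box": {"pushable": 0.6, "liftable": 0.8},
}

affordances = predict_affordances(p_c_given_x, p_a_given_c)
```

Under this assumed factorization, adding a new affordance only requires one probability per category rather than a new mapping from raw sensor features, which is one way to read the scalability argument above.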

Cited by 43 articles