Learning Visual Object Categories for Robot Affordance Prediction
- 23 December 2009
- journal article
- research article
- Published by SAGE Publications in The International Journal of Robotics Research
- Vol. 29 (2-3), 174-197
- https://doi.org/10.1177/0278364909356602
Abstract
A fundamental requirement of any autonomous robot system is the ability to predict the affordances of its environment. The set of affordances defines the actions that are available to the agent given the robot's context. A standard approach to affordance learning is direct perception, which learns direct mappings from sensor measurements to affordance labels. For example, a robot designed for cross-country navigation could map stereo depth information and image features directly into predictions about the traversability of terrain regions. While this approach can succeed for a small number of affordances, it does not scale well as the number of affordances increases. In this paper, we show that visual object categories can be used as an intermediate representation that makes the affordance learning problem scalable. We develop a probabilistic graphical model, the Category-Affordance (CA) model, which describes the relationships between object categories, affordances, and appearance. This model casts visual object categorization as an intermediate inference step in affordance prediction. We describe several novel affordance learning and training strategies that are supported by our new model. Experimental results with indoor mobile robots evaluate these different strategies and demonstrate the advantages of the CA model in affordance learning, especially when learning from data sets of limited size.
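The idea of using object categories as an intermediate inference step can be illustrated with a minimal sketch. It assumes a simplified factorization in which affordances depend on appearance only through the category, so that P(a | x) = sum over c of P(a | c) P(c | x). The abstract does not spell out the CA model's structure, so this factorization, the category and affordance names, and all probabilities below are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical labels and values; none of these come from the paper.
categories = ["chair", "box", "ball"]
affordances = ["sittable", "pushable", "rollable"]

# P(category | appearance) for one observed object, e.g. the output of
# some visual classifier (assumed, not the paper's categorizer).
p_cat_given_x = np.array([0.7, 0.2, 0.1])

# P(affordance is present | category): one row per category,
# one column per affordance.
p_aff_given_cat = np.array([
    [0.9, 0.4, 0.0],   # chair
    [0.3, 0.9, 0.1],   # box
    [0.0, 0.8, 0.9],   # ball
])


def predict_affordances(p_cat, p_aff):
    """Marginalize over categories: P(a | x) = sum_c P(a | c) P(c | x)."""
    return p_cat @ p_aff


p_aff_given_x = predict_affordances(p_cat_given_x, p_aff_given_cat)
for name, p in zip(affordances, p_aff_given_x):
    print(f"{name}: {p:.2f}")
```

Under this assumed factorization, supporting one more affordance means adding one column to the category-to-affordance table rather than training a new mapping from raw sensor features, which is one way to read the scalability argument in the abstract.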