Learning Visual Object Categories for Robot Affordance Prediction

Abstract
A fundamental requirement of any autonomous robot system is the ability to predict the affordances of its environment. The set of affordances defines the actions that are available to the robot in its current context. A standard approach to affordance learning is direct perception, which learns direct mappings from sensor measurements to affordance labels. For example, a robot designed for cross-country navigation could map stereo depth information and image features directly into predictions about the traversability of terrain regions. While this approach can succeed for a small number of affordances, it does not scale well as the number of affordances increases. In this paper, we show that visual object categories can be used as an intermediate representation that makes the affordance learning problem scalable. We develop a probabilistic graphical model, the Category-Affordance (CA) model, which describes the relationships between object categories, affordances, and appearance. This model casts visual object categorization as an intermediate inference step in affordance prediction. We describe several novel affordance learning and training strategies that are supported by our new model. Experimental results with indoor mobile robots evaluate these different strategies and demonstrate the advantages of the CA model in affordance learning, especially when learning from limited-size data sets.
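One way to read the scalability argument is as a factorization of affordance prediction through object categories. The following is a minimal sketch of such a factorization, assuming affordances are conditionally independent of appearance given the category; the abstract does not specify the exact structure of the CA model, so this is illustrative rather than the paper's definition:

\[
P(a_j = 1 \mid x) \;=\; \sum_{k} P(a_j = 1 \mid c = k)\, P(c = k \mid x),
\]

where \(x\) is the observed appearance, \(c\) ranges over object categories, and \(a_j\) is the \(j\)-th affordance. Under this kind of factorization, adding a new affordance only requires estimating the category-conditional terms \(P(a_j \mid c)\), while the appearance model \(P(c \mid x)\) is learned once and shared, which is the sense in which the intermediate category representation scales better than learning a separate appearance-to-affordance mapping for each affordance.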
