Learning to recognize human action sequences

Abstract
Watching another person is one of the major sources of cues in developmental learning. An observer can gain a comprehensive description of the purposes of actions by watching the other person's detailed body movements. Action recognition has traditionally focused on processing fixed-camera observations while ignoring nonvisual information. This paper explores the dynamic properties of eye movements in natural tasks, where eye and head movements are tightly coupled with actions. We present a method that uses eye gaze and head position information to detect the performer's focus of attention. Attention, as represented by eye fixation, is used to spot the target object related to the action. Attention switches are computed and used to segment the action sequence into action units, which are recognized by hidden Markov models. An experimental system built for recognizing actions in the natural task of "stapling a letter" demonstrates the effectiveness of the approach.
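
The pipeline outlined in the abstract (segment at attention switches, then score each unit with hidden Markov models) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: it assumes the gaze data has already been reduced to one fixated-object label per frame, that the features of each action unit are quantized into discrete symbols, and that each action has a small discrete HMM with known parameters. The function names, the `min_len` threshold, and the model layout are all hypothetical.

```python
import math
from itertools import groupby


def segment_by_attention(fixated_objects, min_len=3):
    """Split a per-frame sequence of fixated-object labels into action units.

    A new unit starts at every attention switch (a change of fixated label).
    Runs shorter than min_len frames, and frames labeled None (no fixation),
    are treated as noise and dropped. min_len is an assumed threshold.
    Returns a list of (label, start, end) tuples with end exclusive.
    """
    units, start = [], 0
    for label, run in groupby(fixated_objects):
        length = len(list(run))
        if label is not None and length >= min_len:
            units.append((label, start, start + length))
        start += length
    return units


def log_likelihood(obs, start_p, trans_p, emit_p):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM."""
    n = len(start_p)
    # Initialization: start probability times emission of the first symbol.
    alpha = [start_p[i] * emit_p[i][obs[0]] for i in range(n)]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Induction: propagate through transitions, then emit obs[t].
            alpha = [
                sum(alpha[i] * trans_p[i][j] for i in range(n)) * emit_p[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_p


def recognize(obs, models):
    """Return the name of the model that gives obs the highest likelihood.

    models maps an action name to a (start_p, trans_p, emit_p) tuple.
    """
    return max(models, key=lambda name: log_likelihood(obs, *models[name]))
```

In use, each `(label, start, end)` unit returned by `segment_by_attention` would select the slice of quantized motion symbols passed to `recognize`. Choosing the model with the maximum likelihood is a common way to classify with per-class HMMs; whether the original system used this rule is not stated in the abstract.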
