Finding and tracking people from the bottom up

Abstract
We describe a tracker that can track moving people in long sequences without manual initialization. Moving people are modeled with the assumption that, while configuration can vary quite substantially from frame to frame, appearance does not. This leads to an algorithm that first builds a model of the appearance of the body of each individual by clustering candidate body segments, and then uses this model to find all individuals in each frame. Unusually, the tracker does not rely on a model of human dynamics to identify possible instances of people; such models are unreliable, because human motion is fast and large accelerations are common. We show that our tracking algorithm can be interpreted as a loopy inference procedure on an underlying Bayes net. Experiments on video of real scenes demonstrate that this tracker (a) counts distinct individuals; (b) identifies and tracks them; (c) recovers when it loses track, for example when individuals are occluded or briefly leave the view; (d) identifies the configuration of the body largely correctly; and (e) does not depend on particular models of human motion.
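To make the two-stage structure of the abstract concrete, the sketch below illustrates the general "build an appearance model by clustering candidate segments, then use it to find people in each frame" idea. It is a minimal, hypothetical illustration rather than the authors' implementation: the segment features, the use of k-means, the distance threshold max_dist, and the helper names build_appearance_models and find_people_in_frame are all assumptions introduced here for clarity.

```python
# Hypothetical sketch of a two-stage "cluster appearances, then detect" loop.
# Candidate body segments are represented by simple appearance feature vectors.

import numpy as np
from sklearn.cluster import KMeans


def build_appearance_models(candidate_segments, n_people):
    """Stage 1: cluster candidate segment appearances pooled over the whole
    sequence; each cluster centre stands in for one person's appearance."""
    features = np.vstack([seg["appearance"]
                          for frame in candidate_segments
                          for seg in frame])
    kmeans = KMeans(n_clusters=n_people, n_init=10, random_state=0).fit(features)
    return kmeans.cluster_centers_


def find_people_in_frame(frame_segments, appearance_models, max_dist=0.5):
    """Stage 2: in a single frame, assign each candidate segment to the
    nearest learned appearance model, or to no one if it is too far away."""
    detections = {}
    for seg in frame_segments:
        dists = np.linalg.norm(appearance_models - seg["appearance"], axis=1)
        person = int(np.argmin(dists))
        if dists[person] < max_dist:
            detections.setdefault(person, []).append(seg["box"])
    return detections


# Toy usage: two frames, each with noisy candidate segments for two people.
rng = np.random.default_rng(0)
frames = [[{"appearance": rng.normal(loc=c, scale=0.05, size=8), "box": (i, c)}
           for c in (0.2, 0.8) for i in range(3)]
          for _ in range(2)]
models = build_appearance_models(frames, n_people=2)
for t, frame in enumerate(frames):
    print(t, find_people_in_frame(frame, models))
```

Because the appearance models are learned from the whole sequence rather than from a per-frame dynamical prediction, this kind of detector can reacquire a person after occlusion or a brief exit from view, which is the property the abstract emphasizes.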