Pedestrian Parsing via Deep Decompositional Network

1 December 2013

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

No. 15505499, pp. 2648-2655
https://doi.org/10.1109/iccv.2013.329

Abstract

We propose a new Deep Decompositional Network (DDN) for parsing pedestrian images into semantic regions, such as hair, head, body, arms, and legs, where the pedestrians can be heavily occluded. Unlike existing methods based on template matching or Bayesian inference, our approach directly maps low-level visual features to the label maps of body parts with DDN, which is able to accurately estimate complex pose variations with good robustness to occlusions and background clutter. DDN jointly estimates occluded regions and segments body parts by stacking three types of hidden layers: occlusion estimation layers, completion layers, and decomposition layers. The occlusion estimation layers estimate a binary mask indicating which part of a pedestrian is invisible. The completion layers synthesize low-level features of the invisible part from the original features and the occlusion mask. The decomposition layers directly transform the synthesized visual features to label maps. We devise a new strategy to pre-train these hidden layers, and then fine-tune the entire network using stochastic gradient descent. Experimental results show that our approach achieves better segmentation accuracy than state-of-the-art methods on pedestrian images with or without occlusions. Another important contribution of this paper is a large-scale benchmark human parsing dataset that includes 3,673 annotated samples collected from 171 surveillance videos. It is 20 times larger than existing public datasets.
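
The three stacked layer types described in the abstract can be illustrated with a minimal forward-pass sketch. This is not the authors' implementation: the layer sizes, the sigmoid activations, the two-layer depth of each stage, the element-wise combination of features and mask, and all names (`ddn_forward`, `init_layer`, `params`) are assumptions made for illustration, and the weights are random rather than pre-trained or fine-tuned.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_layer(rng, n_in, n_out):
    # Hypothetical initialisation: small random weights, zero biases.
    return rng.normal(0.0, 0.01, size=(n_out, n_in)), np.zeros(n_out)

def ddn_forward(x, params, n_parts, map_shape):
    """Forward pass through the three stages named in the abstract."""
    # 1. Occlusion estimation: a soft mask in (0, 1) per feature.
    #    Thresholding it (e.g. mask > 0.5) would give a binary mask.
    W_o, b_o = params["occlusion"]
    mask = sigmoid(W_o @ x + b_o)

    # 2. Completion: synthesise features for the invisible part from the
    #    original features and the mask (element-wise product is an assumption).
    W_c1, b_c1 = params["completion_1"]
    W_c2, b_c2 = params["completion_2"]
    completed = sigmoid(W_c2 @ sigmoid(W_c1 @ (x * mask) + b_c1) + b_c2)

    # 3. Decomposition: map the completed features to one label map per part.
    W_d1, b_d1 = params["decomposition_1"]
    W_d2, b_d2 = params["decomposition_2"]
    hidden = sigmoid(W_d1 @ completed + b_d1)
    label_maps = sigmoid(W_d2 @ hidden + b_d2)
    return mask, completed, label_maps.reshape((n_parts,) + map_shape)

# Toy sizes, chosen only so the example runs quickly.
rng = np.random.default_rng(0)
n_features, n_hidden, n_parts, map_shape = 64, 32, 5, (8, 4)
n_out = n_parts * map_shape[0] * map_shape[1]
params = {
    "occlusion": init_layer(rng, n_features, n_features),
    "completion_1": init_layer(rng, n_features, n_hidden),
    "completion_2": init_layer(rng, n_hidden, n_features),
    "decomposition_1": init_layer(rng, n_features, n_hidden),
    "decomposition_2": init_layer(rng, n_hidden, n_out),
}

x = rng.random(n_features)  # stand-in for a low-level feature vector
mask, completed, label_maps = ddn_forward(x, params, n_parts, map_shape)
```

Here `label_maps[k]` would hold the per-pixel score for the k-th semantic region (hair, head, body, arms, legs in the abstract's example). In the paper, the layers are pre-trained and the whole network is then fine-tuned with stochastic gradient descent; neither step is shown in this sketch.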


Cited by 104 articles