Self-Attention Network for Human Pose Estimation

Open Access

18 February 2021

journal article
research article
Published by MDPI AG in Applied Sciences

Vol. 11 (4), 1826
https://doi.org/10.3390/app11041826

Abstract

Estimating the positions of human joints from a single monocular RGB image remains a challenging task. Despite great progress in human pose estimation with convolutional neural networks (CNNs), a central problem persists: relationships and constraints between joints, such as the symmetry of the human body, are not well exploited by previous CNN-based methods. Considering the effectiveness of combining local and nonlocal consistencies, we propose an end-to-end self-attention network (SAN) to alleviate this issue. In an SAN, attention-driven, long-range dependency modeling is applied between joints to complement local content and to mine details from all feature locations. To enable an SAN for both 2D and 3D pose estimation, we also design a compatible, effective and general joint learning framework that mixes training data of different dimensionality. We evaluate the proposed network on challenging benchmark datasets. The experimental results show that our method achieves competitive results on the Human3.6M, MPII and COCO datasets.
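
The idea of mining details from all feature locations can be illustrated with a generic self-attention layer over a CNN feature map. The following is a minimal NumPy sketch, not the authors' implementation: the function name `self_attention`, the projection matrices `w_q`, `w_k`, `w_v`, the scaled dot-product form, the residual connection, and all shapes are assumptions chosen for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(features, w_q, w_k, w_v):
    """Generic self-attention over a feature map (illustrative only).

    features: (C, H, W) feature map from a CNN backbone.
    w_q, w_k: (C, D) query/key projections (hypothetical parameters).
    w_v:      (C, C) value projection (hypothetical parameter).
    Returns the refined feature map (C, H, W) and the attention
    matrix (H*W, H*W).
    """
    c, h, w = features.shape
    x = features.reshape(c, h * w).T          # (N, C), one row per location
    q, k, v = x @ w_q, x @ w_k, x @ w_v       # (N, D), (N, D), (N, C)
    # Each location attends to every other location (long-range dependency).
    attn = softmax(q @ k.T / np.sqrt(q.shape[1]), axis=-1)  # (N, N)
    out = x + attn @ v                        # residual keeps local content
    return out.T.reshape(c, h, w), attn


# Toy usage with random data; the sizes are arbitrary.
rng = np.random.default_rng(0)
feats = rng.standard_normal((8, 4, 4))
w_q = rng.standard_normal((8, 2)) * 0.1
w_k = rng.standard_normal((8, 2)) * 0.1
w_v = rng.standard_normal((8, 8)) * 0.1
refined, attn = self_attention(feats, w_q, w_k, w_v)
```

Each row of `attn` is a probability distribution over all spatial locations, so a feature near one joint (for example, the left wrist) can draw on evidence from a distant, related location (for example, the right wrist), which a purely local convolution would not see.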

Funding Information

National Natural Science Foundation of China (61976022)
