3D Human Pose Estimation with Spatial and Temporal Transformers

Zheng Ce; Zhu Sijie; Mendieta Matias; Yang Taojiannan; Chen Chen; Ding Zhengming

Abstract

Transformer architectures have become the model of choice in natural language processing and are now being introduced into computer vision tasks such as image classification, object detection, and semantic segmentation. However, in the field of human pose estimation, convolutional architectures still remain dominant. In this work, we present PoseFormer, a purely transformer-based approach for 3D human pose estimation in videos without convolutional architectures involved. Inspired by recent developments in vision transformers, we design a spatial-temporal transformer structure to comprehensively model the human joint relations within each frame as well as the temporal correlations across frames, then output an accurate 3D human pose of the center frame. We quantitatively and qualitatively evaluate our method on two popular and standard benchmark datasets: Human3.6M and MPI-INF-3DHP. Extensive experiments show that PoseFormer achieves state-of-the-art performance on both datasets. Code is available at https://github.com/zczcwh/PoseFormer
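The abstract states that the model reads a sequence of frames and outputs the 3D pose of the center frame. A minimal sketch of how such input windows could be indexed follows. The function name `center_frame_windows` is hypothetical, and clamping indices at the sequence edges (repeating the first or last frame) is an assumption made here for illustration; the abstract does not say how boundaries are handled.

```python
def center_frame_windows(num_frames, receptive_field):
    """Return, for every target frame, the frame indices of its input window.

    The target frame sits at the center of the window. Indices that would
    fall outside the sequence are clamped to the first or last frame
    (an assumed padding scheme, not taken from the paper).
    """
    if receptive_field < 1 or receptive_field % 2 == 0:
        raise ValueError("receptive_field must be a positive odd number")
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")

    half = receptive_field // 2
    windows = []
    for center in range(num_frames):
        window = [
            min(max(t, 0), num_frames - 1)  # clamp to a valid frame index
            for t in range(center - half, center + half + 1)
        ]
        windows.append(window)
    return windows


if __name__ == "__main__":
    # A 5-frame clip with a 3-frame window.
    for window in center_frame_windows(5, 3):
        print(window)
```

With `receptive_field=81`, each window would hold 81 frame indices with the target frame at position 40, so every predicted pose sees 40 frames of context on each side.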

Code Repositories

zczcwh/DL-HPE (mentioned in GitHub)
zczcwh/PoseFormer (official; PyTorch; mentioned in GitHub)
thuxyz19/test (PyTorch; mentioned in GitHub)

Benchmarks

Benchmark: 3d-human-pose-estimation-on-human36m
Methodology: PoseFormer (f=81)
Metrics:
Average MPJPE (mm): 44.3
Multi-View or Monocular: Monocular
Using 2D ground-truth joints: No

Benchmark: 3d-human-pose-estimation-on-human36m
Methodology: PoseFormer (f=81, GT)
Metrics:
Average MPJPE (mm): 31.3
Multi-View or Monocular: Monocular
Using 2D ground-truth joints: Yes

Benchmark: 3d-human-pose-estimation-on-humaneva-i
Methodology: PoseFormer
Metrics:
Mean Reconstruction Error (mm): 21.6

Benchmark: 3d-human-pose-estimation-on-mpi-inf-3dhp
Methodology: PoseFormer (9 frames)
Metrics:
AUC: 56.4
MPJPE: 77.1
PCK: 88.6

Benchmark: monocular-3d-human-pose-estimation-on-human3
Methodology: PoseFormer (T=81)
Metrics:
2D detector: CPN
Average MPJPE (mm): 44.3
Frames Needed: 81
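
For readers unfamiliar with the metrics listed above, MPJPE (mean per-joint position error) is the average Euclidean distance between predicted and ground-truth joint positions, and PCK is the percentage of joints whose error falls within a distance threshold. The sketch below computes both for a single pose. The function names are illustrative, the 150 mm default threshold is an assumption (this page does not state one), and any alignment of the poses that a given benchmark protocol applies before scoring is left out.

```python
import math


def mpjpe(pred, gt):
    """Mean Euclidean distance between matching joints, in the input units."""
    if not pred or len(pred) != len(gt):
        raise ValueError("pred and gt must be non-empty and the same length")
    return sum(math.dist(p, g) for p, g in zip(pred, gt)) / len(pred)


def pck(pred, gt, threshold=150.0):
    """Percentage of joints whose error is within `threshold`.

    The default of 150 (mm) is an assumed value for illustration.
    """
    if not pred or len(pred) != len(gt):
        raise ValueError("pred and gt must be non-empty and the same length")
    hits = sum(1 for p, g in zip(pred, gt) if math.dist(p, g) <= threshold)
    return 100.0 * hits / len(pred)


if __name__ == "__main__":
    # Two joints, coordinates in millimetres: one exact, one off by 5 mm.
    predicted = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
    ground_truth = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
    print(mpjpe(predicted, ground_truth))               # 2.5
    print(pck(predicted, ground_truth, threshold=4.0))  # 50.0
```

Dataset-level scores such as those reported above would average these per-pose values over all evaluated frames.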
