Graph and Temporal Convolutional Networks for 3D Multi-person Pose Estimation in Monocular Videos
Cheng Yu ; Wang Bo ; Yang Bo ; Tan Robby T.

Abstract
Despite the recent progress, 3D multi-person pose estimation from monocular videos is still challenging due to the commonly encountered problem of missing information caused by occlusion, partially out-of-frame target persons, and inaccurate person detection. To tackle this problem, we propose a novel framework integrating graph convolutional networks (GCNs) and temporal convolutional networks (TCNs) to robustly estimate camera-centric multi-person 3D poses that do not require camera parameters. In particular, we introduce a human-joint GCN, which, unlike the existing GCN, is based on a directed graph that employs the 2D pose estimator's confidence scores to improve the pose estimation results. We also introduce a human-bone GCN, which models the bone connections and provides more information beyond human joints. The two GCNs work together to estimate the spatial frame-wise 3D poses and can make use of both visible joint and bone information in the target frame to estimate the occluded or missing human-part information. To further refine the 3D pose estimation, we use our temporal convolutional networks (TCNs) to enforce the temporal and human-dynamics constraints. We use a joint-TCN to estimate person-centric 3D poses across frames, and propose a velocity-TCN to estimate the speed of 3D joints to ensure the consistency of the 3D pose estimation in consecutive frames. Finally, to estimate the 3D human poses for multiple persons, we propose a root-TCN that estimates camera-centric 3D poses without requiring camera parameters. Quantitative and qualitative evaluations demonstrate the effectiveness of the proposed method.
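The idea of a directed, confidence-weighted joint graph can be illustrated with a small sketch. This is not the authors' implementation: it assumes, purely for illustration, that the weight of the edge from joint j to joint i equals the 2D confidence score of joint j, followed by row normalisation. The function names `directed_adjacency` and `aggregate` are hypothetical.

```python
def directed_adjacency(edges, confidence):
    """Build a row-normalised, directed adjacency matrix (illustrative assumption).

    edges: list of (i, j) skeleton connections.
    confidence: per-joint 2D detector confidence scores in [0, 1].
    a[i][j] is the weight of information flowing from joint j into joint i.
    """
    n = len(confidence)
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = confidence[i]      # self-loop weighted by the joint's own confidence
    for i, j in edges:
        a[i][j] = confidence[j]      # flow j -> i scales with j's confidence
        a[j][i] = confidence[i]      # flow i -> j scales with i's confidence
    for row in a:
        total = sum(row)
        if total > 0:                # leave all-zero rows untouched
            for k in range(n):
                row[k] /= total
    return a


def aggregate(adjacency, features):
    """One propagation step: each joint takes a weighted average of its inputs."""
    n = len(features)
    dim = len(features[0])
    return [
        [sum(adjacency[i][j] * features[j][d] for j in range(n)) for d in range(dim)]
        for i in range(n)
    ]


# Three joints in a chain 0-1-2; joint 1 is occluded (confidence 0).
a = directed_adjacency([(0, 1), (1, 2)], [1.0, 0.0, 1.0])
out = aggregate(a, [[0.0, 0.0], [5.0, 5.0], [2.0, 2.0]])
```

In this toy case the occluded joint 1 is filled in from its confident neighbours, while joints 0 and 2 receive nothing from it. That asymmetry is what an undirected graph with symmetric weights would not provide.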
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| 3d-absolute-human-pose-estimation-on-human36m | GnTCN | MRPE: 88.1 |
| 3d-human-pose-estimation-on-3dpw | GnTCN | PA-MPJPE: 64.2 |
| 3d-human-pose-estimation-on-human36m | GnTCN | Average MPJPE (mm): 40.9; Multi-View or Monocular: Monocular; PA-MPJPE: 30.4; Using 2D ground-truth joints: No |
| 3d-multi-person-pose-estimation-absolute-on | GnTCN | 3DPCK: 45.7 |
| 3d-multi-person-pose-estimation-root-relative | GnTCN | 3DPCK: 87.5 |
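
For readers unfamiliar with the metrics in the table, the following sketch shows how MPJPE (mean per-joint position error) and 3DPCK (percentage of correct keypoints in 3D) are commonly computed. It is a generic illustration, not the evaluation code behind these numbers. The 150 mm threshold is an assumed default following common practice, and the function names are hypothetical. PA-MPJPE additionally aligns the prediction to the ground truth with a Procrustes transform before measuring the error, which is not shown here.

```python
import math


def mpjpe(pred, gt):
    """Mean Euclidean distance between predicted and ground-truth joints (same units as input)."""
    dists = [math.dist(p, g) for p, g in zip(pred, gt)]
    return sum(dists) / len(dists)


def pck3d(pred, gt, threshold=150.0):
    """Percentage of joints whose error is within `threshold` (assumed 150 mm)."""
    hits = sum(1 for p, g in zip(pred, gt) if math.dist(p, g) <= threshold)
    return 100.0 * hits / len(pred)


# Toy example with three joints, coordinates in millimetres.
pred = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 200.0)]
gt = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
error_mm = mpjpe(pred, gt)   # per-joint errors are 0, 5 and 200 mm
pck = pck3d(pred, gt)        # two of the three joints fall within 150 mm
```

Lower is better for MPJPE and PA-MPJPE, while higher is better for 3DPCK.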