Lightweight Multi-View 3D Pose Estimation through Camera-Disentangled Representation

Edoardo Remelli, Shangchen Han, Sina Honari, Pascal Fua, Robert Wang

Abstract

We present a lightweight solution to recover 3D pose from multi-view images captured with spatially calibrated cameras. Building upon recent advances in interpretable representation learning, we exploit 3D geometry to fuse input images into a unified latent representation of pose, which is disentangled from camera viewpoints. This allows us to reason effectively about 3D pose across different views without using compute-intensive volumetric grids. Our architecture then conditions the learned representation on camera projection operators to produce accurate per-view 2D detections, which can be lifted to 3D via a differentiable Direct Linear Transform (DLT) layer. To do so efficiently, we propose a novel implementation of DLT that is orders of magnitude faster on GPU architectures than standard SVD-based triangulation methods. We evaluate our approach on two large-scale human pose datasets (H36M and Total Capture): our method outperforms or performs comparably to the state-of-the-art volumetric methods while, unlike them, yielding real-time performance.
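
For orientation, the standard SVD-based triangulation that the abstract uses as its point of comparison can be sketched as follows. This is a minimal NumPy illustration of textbook DLT, not the authors' faster GPU implementation; the function names `triangulate_dlt` and `project` and the toy camera matrices are assumptions made for this example.

```python
import numpy as np

def project(P, X):
    """Project a 3D point X (3,) with a 3x4 camera matrix P to pixel coordinates (2,)."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def triangulate_dlt(projections, points_2d):
    """Textbook DLT triangulation of a single joint via SVD.

    projections: sequence of V camera matrices, each of shape (3, 4)
    points_2d:   sequence of V detections (u, v), one per view
    Returns the 3D point (3,) minimising the algebraic error.
    """
    rows = []
    for P, (u, v) in zip(projections, points_2d):
        # Each view contributes two linear constraints on the homogeneous point.
        rows.append(u * P[2] - P[0])
        rows.append(v * P[2] - P[1])
    A = np.stack(rows)                 # shape (2V, 4)
    _, _, vt = np.linalg.svd(A)
    X_h = vt[-1]                       # right singular vector of the smallest singular value
    return X_h[:3] / X_h[3]            # de-homogenise

# Toy setup (hypothetical): two cameras with identity intrinsics, the second
# one shifted along the x-axis.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.2, -0.1, 4.0])
X_est = triangulate_dlt([P1, P2], [project(P1, X_true), project(P2, X_true)])
```

With noise-free detections the estimate matches `X_true` up to floating-point error. In the pipeline described above, the per-view 2D detections would take the place of the synthetic projections, with one triangulation per joint.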

Benchmarks

Benchmark | Methodology | Metrics
3d-human-pose-estimation-on-human36m | LWCDR (extra train data) | Average MPJPE (mm): 21.0; Multi-View or Monocular: Multi-View; Using 2D ground-truth joints: No
3d-human-pose-estimation-on-human36m | LWCDR | Average MPJPE (mm): 30.2; Multi-View or Monocular: Multi-View; Using 2D ground-truth joints: No
3d-human-pose-estimation-on-total-capture | LWCDR | Average MPJPE (mm): 27.5
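
The benchmark figures above are reported as MPJPE (mean per-joint position error), which is the Euclidean distance between predicted and ground-truth 3D joints, averaged over joints and frames. A minimal sketch of that computation follows; the function name `mpjpe` and the array layout are assumptions for illustration, and the exact evaluation protocol behind the listed numbers (for example any alignment step) is not stated here.

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean per-joint position error.

    pred, gt: arrays of shape (..., J, 3) holding 3D joint positions in the
    same units (millimetres for the figures above).
    Returns the average Euclidean distance over all joints and frames.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    return float(np.linalg.norm(pred - gt, axis=-1).mean())

# Two joints: one is off by 5 mm (a 3-4-5 triangle), the other is exact.
example_error = mpjpe([[0, 0, 0], [1, 1, 1]], [[3, 4, 0], [1, 1, 1]])
```

In this toy example the per-joint errors are 5 mm and 0 mm, so the function returns 2.5.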
