5 months ago

Multi-HMR: Multi-Person Whole-Body Human Mesh Recovery in a Single Shot

Baradel Fabien ; Armando Matthieu ; Galaaoui Salma ; Brégier Romain ; Weinzaepfel Philippe ; Rogez Grégory ; Lucas Thomas

Abstract

We present Multi-HMR, a strong single-shot model for multi-person 3D human mesh recovery from a single RGB image. Predictions encompass the whole body, i.e., including hands and facial expressions, using the SMPL-X parametric model and 3D location in the camera coordinate system. Our model detects people by predicting coarse 2D heatmaps of person locations, using features produced by a standard Vision Transformer (ViT) backbone. It then predicts their whole-body pose, shape and 3D location using a new cross-attention module called the Human Prediction Head (HPH), with one query attending to the entire set of features for each detected person. As direct prediction of fine-grained hand and facial poses in a single shot, i.e., without relying on explicit crops around body parts, is hard to learn from existing data, we introduce CUFFS, the Close-Up Frames of Full-Body Subjects dataset, containing humans close to the camera with diverse hand poses. We show that incorporating it into the training data further enhances predictions, particularly for hands. Multi-HMR also optionally accounts for camera intrinsics, if available, by encoding camera ray directions for each image token. This simple design achieves strong performance on whole-body and body-only benchmarks simultaneously: a ViT-S backbone on $448{\times}448$ images already yields a fast and competitive model, while larger models and higher resolutions obtain state-of-the-art results.
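To make the camera-intrinsics step more concrete, the sketch below computes one unit ray direction per image token under a pinhole camera model. It is an illustration only, not the authors' implementation: the function name `token_ray_directions`, the patch size of 14, the single focal length, and the example intrinsics are all assumptions, and the way the abstract's model encodes these directions into token features is not shown.

```python
import math


def token_ray_directions(image_size, patch_size, focal, principal_point):
    """Return an n x n grid of unit ray directions, one per patch token.

    Assumes a square image, square patches, and a pinhole camera with a
    single focal length (in pixels). All parameter names are hypothetical.
    """
    cx, cy = principal_point
    n = image_size // patch_size  # tokens per side
    rays = []
    for row in range(n):
        ray_row = []
        for col in range(n):
            # Pixel coordinates of the patch centre.
            u = (col + 0.5) * patch_size
            v = (row + 0.5) * patch_size
            # Back-project through the inverse intrinsics: K^-1 [u, v, 1]^T.
            x = (u - cx) / focal
            y = (v - cy) / focal
            norm = math.sqrt(x * x + y * y + 1.0)
            ray_row.append((x / norm, y / norm, 1.0 / norm))
        rays.append(ray_row)
    return rays


# Example: a 448x448 image split into 14x14 patches gives a 32x32 token grid.
rays = token_ray_directions(448, 14, focal=500.0, principal_point=(224.0, 224.0))
```

Each token then carries a direction that depends on the intrinsics, which is what lets the same image content be interpreted differently under different focal lengths.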

Code Repositories

naver/multi-hmr
Official
pytorch
Mentioned in GitHub

Benchmarks

Benchmark | Methodology | Metrics
3d-human-pose-estimation-on-3dpw | Multi-HMR | MPJPE: 61.4; MPVPE: 75.9; PA-MPJPE: 41.7
3d-human-pose-estimation-on-ubody | Multi-HMR | PA-PVE-All: 23.6; PA-PVE-Face: 1.8; PA-PVE-Hands: 7.0; PVE-All: 56.4; PVE-Face: 19.3; PVE-Hands: 24.9
3d-human-reconstruction-on-ehf | Multi-HMR | MPVPE: 44.2; PA V2V (mm), face: 5.5; PA V2V (mm), whole body: 32.7
3d-multi-person-human-pose-estimation-on | Multi-HMR | 3DPCK: 89.5
3d-multi-person-mesh-recovery-on-agora | Multi-HMR | FB-MVE: 95.9; FB-NMVE: 102.0
human-mesh-recovery-on-bedlam | Multi-HMR | PVE-All: 76.80
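For readers unfamiliar with the metric names, MPJPE (mean per-joint position error) is the average Euclidean distance between predicted and ground-truth 3D joints, typically reported in millimetres. The sketch below is a minimal illustration with made-up joint coordinates, not the evaluation code behind the numbers above; benchmarks commonly align the root joint first, and the PA- variants additionally apply a Procrustes alignment, neither of which is shown here.

```python
import math


def mpjpe(pred_joints, gt_joints):
    """Mean Euclidean distance between matching 3D joints.

    Both arguments are equal-length sequences of (x, y, z) tuples.
    """
    if len(pred_joints) != len(gt_joints) or not pred_joints:
        raise ValueError("joint lists must be non-empty and of equal length")
    total = sum(math.dist(p, g) for p, g in zip(pred_joints, gt_joints))
    return total / len(pred_joints)


# Hypothetical two-joint example: errors of 0 and 5 average to 2.5.
example_error = mpjpe([(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)],
                      [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)])
```

The vertex-based metrics (MPVPE, PVE) follow the same averaging idea, applied to mesh vertices instead of joints.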
