5 months ago

Hamba: Single-view 3D Hand Reconstruction with Graph-guided Bi-Scanning Mamba

Dong Haoye ; Chharia Aviral ; Gou Wenbo ; Carrasco Francisco Vicente ; De la Torre Fernando

Abstract

3D Hand reconstruction from a single RGB image is challenging due to the articulated motion, self-occlusion, and interaction with objects. Existing SOTA methods employ attention-based transformers to learn the 3D hand pose and shape, yet they do not fully achieve robust and accurate performance, primarily due to inefficiently modeling spatial relations between joints. To address this problem, we propose a novel graph-guided Mamba framework, named Hamba, which bridges graph learning and state space modeling. Our core idea is to reformulate Mamba's scanning into graph-guided bidirectional scanning for 3D reconstruction using a few effective tokens. This enables us to efficiently learn the spatial relationships between joints for improving reconstruction performance. Specifically, we design a Graph-guided State Space (GSS) block that learns the graph-structured relations and spatial sequences of joints and uses 88.5% fewer tokens than attention-based methods. Additionally, we integrate the state space features and the global features using a fusion module. By utilizing the GSS block and the fusion module, Hamba effectively leverages the graph-guided state space features and jointly considers global and local features to improve performance. Experiments on several benchmarks and in-the-wild tests demonstrate that Hamba significantly outperforms existing SOTAs, achieving the PA-MPVPE of 5.3mm and F@15mm of 0.992 on FreiHAND. At the time of this paper's acceptance, Hamba holds the top position, Rank 1 in two Competition Leaderboards on 3D hand reconstruction. Project Website: https://humansensinglab.github.io/Hamba/
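To make the idea of graph-guided bidirectional scanning more concrete, here is a toy NumPy sketch. It is not the paper's GSS block and is not taken from the official repository. It assumes a 21-joint hand skeleton (wrist plus five 4-joint finger chains), replaces Mamba's selective, input-dependent state space parameters with a single fixed decay, and uses hypothetical function names throughout. It only illustrates the general pattern: mix joint tokens along skeleton edges, then run a linear recurrence over the joint sequence in both directions and combine the results.

```python
import numpy as np

NUM_JOINTS = 21
# Assumed skeleton layout: joint 0 is the wrist, finger f owns joints 4f+1..4f+4.
EDGES = [(0 if k == 0 else 4 * f + k, 4 * f + k + 1)
         for f in range(5) for k in range(4)]


def normalized_adjacency(num_nodes, edges):
    """Row-normalized adjacency with self-loops (each row sums to 1)."""
    a = np.eye(num_nodes)
    for i, j in edges:
        a[i, j] = a[j, i] = 1.0
    return a / a.sum(axis=1, keepdims=True)


def linear_scan(x, decay):
    """Toy state space recurrence h_t = decay * h_{t-1} + (1 - decay) * x_t."""
    h = np.zeros_like(x[0])
    out = np.empty_like(x)
    for t in range(len(x)):
        h = decay * h + (1.0 - decay) * x[t]
        out[t] = h
    return out


def graph_guided_bidirectional_scan(tokens, adj, decay=0.5):
    """Mix tokens over the graph, then scan forward and backward and sum."""
    tokens = np.asarray(tokens, dtype=float)
    mixed = adj @ tokens                          # graph step: neighbor averaging
    fwd = linear_scan(mixed, decay)               # joint 0 -> joint 20
    bwd = linear_scan(mixed[::-1], decay)[::-1]   # joint 20 -> joint 0
    return fwd + bwd


adj = normalized_adjacency(NUM_JOINTS, EDGES)
joint_tokens = np.random.default_rng(0).normal(size=(NUM_JOINTS, 8))
features = graph_guided_bidirectional_scan(joint_tokens, adj)
print(features.shape)
```

Because the backward pass sees the joints that the forward pass has not yet reached, every output token depends on the whole joint sequence, which is the motivation for scanning in both directions.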

Code Repositories

humansensinglab/Hamba
Official
pytorch
Mentioned in GitHub

Benchmarks

Benchmark: 3d-hand-pose-estimation-on-freihand
Methodology: Hamba
Metrics:
PA-F@15mm: 0.992
PA-F@5mm: 0.806
PA-MPJPE (mm): 5.7
PA-MPVPE (mm): 5.3

Benchmark: 3d-hand-pose-estimation-on-hint-hand
Methodology: Hamba
Metrics:
PCK@0.05 (New Days) All: 48.7
PCK@0.05 (New Days) Occ: 28.2
PCK@0.05 (New Days) Visible: 61.2
PCK@0.05 (VISOR) All: 47.2
PCK@0.05 (VISOR) Occ: 29.9
PCK@0.05 (VISOR) Visible: 61.4

Benchmark: 3d-hand-pose-estimation-on-ho-3d
Methodology: Hamba
Metrics:
AUC_J: 0.850
AUC_V: 0.846
F@15mm: 0.982
F@5mm: 0.648
PA-MPJPE (mm): 7.5
PA-MPVPE (mm): 7.7

Benchmark: 3d-hand-pose-estimation-on-ho-3d-v3
Methodology: Hamba
Metrics:
AUC_J: 0.861
AUC_V: 0.864
F@15mm: 0.982
F@5mm: 0.681
PA-MPJPE (mm): 6.9
PA-MPVPE (mm): 6.8
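
For readers unfamiliar with the metrics above, the following NumPy sketch shows how Procrustes-aligned errors (PA-MPJPE over joints, PA-MPVPE over mesh vertices) and the F-score at a distance threshold are commonly defined. It is an illustration only, not the official evaluation code of FreiHAND or HO-3D. The function names are hypothetical, and inputs are assumed to be (N, 3) arrays in a consistent unit.

```python
import numpy as np


def procrustes_align(pred, gt):
    """Align pred (N, 3) to gt (N, 3) with the best scale, rotation, translation."""
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g
    # Optimal rotation from the SVD of the cross-covariance (Kabsch / Umeyama).
    u, s, vt = np.linalg.svd(p.T @ g)
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0   # avoid reflections
    fix = np.diag([1.0, 1.0, d])
    rot = vt.T @ fix @ u.T
    scale = (s * np.diag(fix)).sum() / (p ** 2).sum()
    return scale * p @ rot.T + mu_g


def pa_mean_error(pred, gt):
    """Mean per-point Euclidean error after Procrustes alignment.

    With joints this corresponds to PA-MPJPE, with mesh vertices to PA-MPVPE.
    """
    aligned = procrustes_align(pred, gt)
    return np.linalg.norm(aligned - gt, axis=1).mean()


def f_score(pred, gt, thresh):
    """Harmonic mean of precision and recall at a distance threshold."""
    dist = np.linalg.norm(pred[:, None, :] - gt[None, :, :], axis=-1)
    precision = (dist.min(axis=1) < thresh).mean()   # pred points near some gt point
    recall = (dist.min(axis=0) < thresh).mean()      # gt points near some pred point
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Usage with synthetic data: a scaled, rotated, shifted copy of the ground truth.
rng = np.random.default_rng(0)
gt_points = rng.normal(size=(21, 3))
c, s = np.cos(0.7), np.sin(0.7)
rot_z = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
pred_points = 2.5 * gt_points @ rot_z.T + np.array([1.0, -2.0, 3.0])
print(pa_mean_error(pred_points, gt_points))   # close to zero after alignment
```

Procrustes alignment removes global scale, rotation, and translation, so the PA metrics measure only the error in articulated pose and shape. F@5mm and F@15mm would correspond to `thresh` values of 0.005 and 0.015 when coordinates are in meters.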
