Semi-Supervised 3D Hand-Object Poses Estimation with Interactions in Time
Liu Shaowei; Jiang Hanwen; Xu Jiarui; Liu Sifei; Wang Xiaolong

Abstract
Estimating 3D hand and object pose from a single image is an extremely challenging problem: hands and objects are often self-occluded during interactions, and 3D annotations are scarce, as even humans cannot directly label the ground truth from a single image perfectly. To tackle these challenges, we propose a unified framework for estimating the 3D hand and object poses with semi-supervised learning. We build a joint learning framework where we perform explicit contextual reasoning between hand and object representations by a Transformer. Going beyond limited 3D annotations in a single image, we leverage the spatial-temporal consistency in large-scale hand-object videos as a constraint for generating pseudo labels in semi-supervised learning. Our method not only improves hand pose estimation on a challenging real-world dataset, but also substantially improves the object pose, which has fewer ground truths per instance. By training with large-scale diverse videos, our model also generalizes better across multiple out-of-domain datasets. Project page and code: https://stevenlsw.github.io/Semi-Hand-Object
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| 3d-hand-pose-estimation-on-dexycb | SHO | Average MPJPE (mm): 15.2; MPVPE: -; PA-MPVPE: -; PA-VAUC: -; Procrustes-Aligned MPJPE: 6.58; VAUC: - |
| 3d-hand-pose-estimation-on-ho-3d | SHO | PA-MPJPE (mm): 10.1 |
| hand-object-pose-on-ho-3d | SHO | ADD-S: -; Average MPJPE (mm): -; OME: -; PA-MPJPE: 10.1; ST-MPJPE: 31.7 |
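
The table reports MPJPE (mean per-joint position error) and its Procrustes-aligned variant, PA-MPJPE. As a rough illustration of how these two quantities are commonly computed, here is a minimal NumPy sketch. It assumes predicted and ground-truth joints are given as `(J, 3)` arrays in the same unit (millimetres in the table above). The function names are illustrative and are not taken from the authors' code, and official benchmark evaluation may differ in details such as root-joint alignment.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean Euclidean distance between corresponding joints.

    pred, gt: arrays of shape (J, 3) in the same unit.
    """
    return float(np.linalg.norm(pred - gt, axis=1).mean())


def procrustes_align(pred, gt):
    """Align pred to gt with the best similarity transform
    (scale, rotation, translation) in the least-squares sense."""
    mu_p, mu_g = pred.mean(axis=0), gt.mean(axis=0)
    p, g = pred - mu_p, gt - mu_g

    # SVD of the 3x3 cross-covariance gives the optimal rotation.
    U, S, Vt = np.linalg.svd(p.T @ g)

    # Flip the last axis if needed so the result is a proper rotation
    # (determinant +1) rather than a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    D = np.diag([1.0, 1.0, d])
    R = Vt.T @ D @ U.T

    # Optimal isotropic scale for the centred point sets.
    scale = (S * np.diag(D)).sum() / (p ** 2).sum()

    return scale * p @ R.T + mu_g


def pa_mpjpe(pred, gt):
    """MPJPE after Procrustes alignment of pred onto gt."""
    return mpjpe(procrustes_align(pred, gt), gt)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    gt = rng.normal(size=(21, 3))  # 21 hand joints, synthetic data

    # Build a prediction that differs from gt only by a similarity
    # transform: PA-MPJPE should vanish while plain MPJPE does not.
    theta = np.deg2rad(30.0)
    Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0,            0.0,           1.0]])
    pred = 1.5 * gt @ Rz.T + np.array([0.1, -0.2, 0.3])

    print("MPJPE:   ", mpjpe(pred, gt))
    print("PA-MPJPE:", pa_mpjpe(pred, gt))
```

Because PA-MPJPE removes global scale, rotation, and translation before measuring error, it isolates errors in the articulated pose itself, which is consistent with the PA-aligned numbers in the table being lower than the unaligned ones.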