Human Pose as Compositional Tokens

Zigang Geng, Chunyu Wang, Yixuan Wei, Ze Liu, Houqiang Li, Han Hu

Abstract

Human pose is typically represented by a coordinate vector of body joints or by their heatmap embeddings. Although these representations are easy to process, they admit unrealistic pose estimates because they do not model the dependencies between body joints. In this paper, we present a structured representation, named Pose as Compositional Tokens (PCT), to explore the joint dependency. It represents a pose by M discrete tokens, each characterizing a sub-structure of several interdependent joints. The compositional design enables it to achieve a small reconstruction error at a low cost. We then cast pose estimation as a classification task: we learn a classifier to predict the categories of the M tokens from an image, and a pre-learned decoder network recovers the pose from the tokens without further post-processing. We show that it achieves pose estimation results better than or comparable to those of existing methods in general scenarios, yet continues to work well when occlusion occurs, which is ubiquitous in practice. The code and models are publicly available at https://github.com/Gengzigang/PCT.
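
The inference path described in the abstract (classify M tokens, look them up, decode to a pose) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the sizes, the random codebook, the random logits standing in for the image classifier, and the single linear layer standing in for the pre-learned decoder are all hypothetical, and the function names are invented for this example.

```python
import random

# All sizes below are illustrative assumptions, not values from the paper.
NUM_TOKENS = 4      # M: tokens per pose
CODEBOOK_SIZE = 8   # number of categories each token can take
TOKEN_DIM = 3       # feature dimension of a codebook entry
NUM_JOINTS = 5      # joints in the recovered pose


def argmax(values):
    """Return the index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])


def predict_token_ids(logits):
    """Classification step: pick one category per token from its logits."""
    return [argmax(row) for row in logits]


def decode_pose(token_ids, codebook, weights):
    """Look up token features and map them to (x, y) joint coordinates.

    A single linear layer is a hypothetical stand-in for the pre-learned
    decoder network mentioned in the abstract.
    """
    # Concatenate the codebook entries selected by the M token ids.
    features = [x for t in token_ids for x in codebook[t]]
    # Linear map from the concatenated features to 2 * NUM_JOINTS values.
    flat = [sum(w * f for w, f in zip(row, features)) for row in weights]
    return [(flat[2 * j], flat[2 * j + 1]) for j in range(NUM_JOINTS)]


rng = random.Random(0)
codebook = [[rng.uniform(-1, 1) for _ in range(TOKEN_DIM)]
            for _ in range(CODEBOOK_SIZE)]
weights = [[rng.uniform(-1, 1) for _ in range(NUM_TOKENS * TOKEN_DIM)]
           for _ in range(2 * NUM_JOINTS)]
# Stand-in for the image classifier's output: one row of logits per token.
logits = [[rng.gauss(0, 1) for _ in range(CODEBOOK_SIZE)]
          for _ in range(NUM_TOKENS)]

token_ids = predict_token_ids(logits)
pose = decode_pose(token_ids, codebook, weights)
print(token_ids)
print(pose)
```

Because every joint coordinate is computed from all M token features, each token influences several joints at once, which is the sense in which the representation is compositional in this sketch.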

Code Repositories

gengzigang/pct (official, PyTorch)

Benchmarks

Benchmark | Methodology | Metrics
pose-estimation-on-coco-test-dev | PCT (256x256) | AP: 78.3, AP50: 92.9, AP75: 85.9
pose-estimation-on-mpii-human-pose | PCT (swin-l, test set) | PCKh-0.5: 94.3
pose-estimation-on-mpii-human-pose | PCT (swin-b, test set) | PCKh-0.5: 93.8
