5 months ago

RTMO: Towards High-Performance One-Stage Real-Time Multi-Person Pose Estimation

Lu Peng ; Jiang Tao ; Li Yining ; Li Xiangtai ; Chen Kai ; Yang Wenming

Abstract

Real-time multi-person pose estimation presents significant challenges in balancing speed and precision. While two-stage top-down methods slow down as the number of people in the image increases, existing one-stage methods often fail to simultaneously deliver high accuracy and real-time performance. This paper introduces RTMO, a one-stage pose estimation framework that seamlessly integrates coordinate classification by representing keypoints using dual 1-D heatmaps within the YOLO architecture, achieving accuracy comparable to top-down methods while maintaining high speed. We propose a dynamic coordinate classifier and a tailored loss function for heatmap learning, specifically designed to address the incompatibilities between coordinate classification and dense prediction models. RTMO outperforms state-of-the-art one-stage pose estimators, achieving 1.1% higher AP on COCO while operating about 9 times faster with the same backbone. Our largest model, RTMO-l, attains 74.8% AP on COCO val2017 and 141 FPS on a single V100 GPU, demonstrating its efficiency and accuracy. The code and models are available at https://github.com/open-mmlab/mmpose/tree/main/projects/rtmo.
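To make the "dual 1-D heatmaps" idea concrete, the following is a minimal sketch of generic coordinate-classification decoding: each keypoint gets one distribution over horizontal bins and one over vertical bins, and the predicted location is read off from the peak of each. This is not RTMO's dynamic coordinate classifier or its loss. The function name `decode_dual_1d_heatmaps`, the `bins_per_pixel` parameter, and the toy shapes are all assumptions made for illustration.

```python
import numpy as np


def decode_dual_1d_heatmaps(heatmap_x, heatmap_y, bins_per_pixel=2.0):
    """Decode keypoints from a pair of 1-D heatmaps (illustrative only).

    heatmap_x: array of shape (K, num_x_bins), one row per keypoint.
    heatmap_y: array of shape (K, num_y_bins), one row per keypoint.
    bins_per_pixel: hypothetical resolution factor (bins per input pixel).
    Returns (coords, scores): coords is (K, 2) in pixels, scores is (K,).
    """
    # The most likely bin along each axis is the predicted coordinate.
    x_bins = heatmap_x.argmax(axis=1)
    y_bins = heatmap_y.argmax(axis=1)
    # Convert bin indices back to pixel units.
    coords = np.stack([x_bins, y_bins], axis=1) / bins_per_pixel
    # Use the weaker of the two peak values as a simple confidence score.
    scores = np.minimum(heatmap_x.max(axis=1), heatmap_y.max(axis=1))
    return coords, scores


# Toy example: one keypoint, 16 horizontal bins and 12 vertical bins.
hx = np.zeros((1, 16))
hy = np.zeros((1, 12))
hx[0, 10] = 0.9  # peak at x-bin 10
hy[0, 6] = 0.8   # peak at y-bin 6
coords, scores = decode_dual_1d_heatmaps(hx, hy, bins_per_pixel=2.0)
```

With two bins per pixel, the peaks at x-bin 10 and y-bin 6 decode to pixel position (5.0, 3.0). Because the two axes are classified independently, the output size grows with width plus height rather than width times height, which is the usual motivation for this representation over a 2-D heatmap.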

Code Repositories

open-mmlab/mmpose

Official

pytorch

Benchmarks

Benchmark	Methodology	Metrics
multi-person-pose-estimation-on-crowdpose	RTMO-l	AP Easy: 88.8; AP Hard: 77.2; AP Medium: 84.7; FPS: 52.4; mAP @0.5:0.95: 83.8
