Jiang Lei, Wei Ye, Ni Hao

Abstract
Diffusion models have become a popular choice for human motion synthesis due to their powerful generative capabilities. However, their high computational complexity and large number of sampling steps pose challenges for real-time applications. Fortunately, the Consistency Model (CM) offers a solution, reducing the number of sampling steps from hundreds to a few, typically fewer than four, and thus significantly accelerating diffusion-based synthesis. However, applying CM to text-conditioned human motion synthesis in latent space yields unsatisfactory generation results. In this paper, we introduce **MotionPCM**, a phased consistency model-based approach designed to improve the quality and efficiency of real-time motion synthesis in latent space. Experimental results on the HumanML3D dataset show that our model achieves real-time inference at over 30 frames per second in a single sampling step while outperforming the previous state-of-the-art with a 38.9% improvement in FID. The code will be made available for reproduction.
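For intuition, the sketch below illustrates the general few-step consistency sampling loop that the abstract describes: a learned consistency function maps a noisy latent directly to a clean-latent estimate, and sampling alternates denoising with re-noising at decreasing noise levels. The network architecture, noise schedule, dimensions, and conditioning here are illustrative assumptions, not MotionPCM's actual implementation.

```python
import torch
import torch.nn as nn

# Illustrative sketch of few-step consistency sampling in latent space.
# All sizes, the MLP backbone, and the linear sigma schedule are assumptions
# made for this example; they do not reflect the paper's architecture.

class ConsistencyFunction(nn.Module):
    """Maps a noisy latent z_t (with timestep and text embedding) directly
    to an estimate of the clean latent z_0 in one forward pass."""
    def __init__(self, latent_dim=256, cond_dim=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + 1 + cond_dim, 1024),
            nn.SiLU(),
            nn.Linear(1024, latent_dim),
        )

    def forward(self, z_t, t, text_emb):
        t_feat = t.expand(z_t.shape[0], 1)  # broadcast scalar noise level per sample
        return self.net(torch.cat([z_t, t_feat, text_emb], dim=-1))

@torch.no_grad()
def consistency_sample(model, text_emb, steps=1, latent_dim=256, sigma_max=80.0):
    """Few-step sampling: predict the clean latent, then (optionally)
    re-noise to the next, smaller noise level and predict again."""
    z = torch.randn(text_emb.shape[0], latent_dim) * sigma_max
    sigmas = torch.linspace(sigma_max, 0.0, steps + 1)
    for i in range(steps):
        z0 = model(z, sigmas[i].view(1, 1), text_emb)  # one-shot clean-latent estimate
        if sigmas[i + 1] > 0:
            z = z0 + sigmas[i + 1] * torch.randn_like(z0)  # re-inject noise for next step
        else:
            z = z0
    return z  # in a latent motion pipeline, this would be decoded by a motion VAE

# Usage: with steps=1, a single network evaluation produces the final latent,
# which is what enables single-step, real-time inference.
model = ConsistencyFunction()
text_emb = torch.randn(4, 512)  # stand-in for a text encoder's output
latents = consistency_sample(model, text_emb, steps=1)
```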
Benchmarks
| Benchmark | Methodology | Diversity | FID | Multimodality | R-Precision (Top-3) |
|---|---|---|---|---|---|
| motion-synthesis-on-humanml3d | MotionPCM | 9.575 | 0.030 | 1.714 | 0.842 |
| motion-synthesis-on-kit-motion-language | MotionPCM | 10.827 | 0.294 | 1.254 | 0.787 |