Guo Wen ; Bie Xiaoyu ; Alameda-Pineda Xavier ; Moreno-Noguer Francesc

Abstract
Human motion prediction aims to forecast future poses given a sequence of past 3D skeletons. While this problem has recently received increasing attention, it has mostly been tackled for single humans in isolation. In this paper, we explore this problem for humans performing collaborative tasks: we seek to predict the future motion of two interacting persons given two sequences of their past skeletons. We propose a novel cross-interaction attention mechanism that exploits historical information of both persons and learns to predict cross dependencies between the two pose sequences. Since no dataset to train such interactive situations is available, we collected ExPI (Extreme Pose Interaction), a new lab-based person interaction dataset of professional dancers performing Lindy-hop dancing actions, which contains 115 sequences with 30K frames annotated with 3D body poses and shapes. We thoroughly evaluate our cross-interaction network on ExPI and show that, in both short- and long-term predictions, it consistently outperforms state-of-the-art methods for single-person motion prediction.
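The abstract describes attention in which one person's motion history attends to the other's. As a rough illustration only, the sketch below shows generic scaled dot-product cross-attention between two pose sequences in NumPy. It is not the authors' XIA module: the function names, the shapes (10 past frames, 54 pose features per frame, 16 attention dimensions), and the random projection matrices are all assumptions made for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_seq, context_seq, w_q, w_k, w_v):
    """Generic scaled dot-product cross-attention (illustrative, not XIA).

    query_seq:   (T_q, D) past poses of the person being predicted
    context_seq: (T_c, D) past poses of the other person
    Returns the attended features (T_q, d_v) and the weights (T_q, T_c).
    """
    q = query_seq @ w_q      # queries come from one person
    k = context_seq @ w_k    # keys and values come from the other person
    v = context_seq @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ v, weights

# Hypothetical sizes: 10 past frames, 54 features per pose, 16 attention dims.
rng = np.random.default_rng(0)
T, D, d = 10, 54, 16
person_a = rng.normal(size=(T, D))
person_b = rng.normal(size=(T, D))
w_q, w_k, w_v = (rng.normal(size=(D, d)) * 0.1 for _ in range(3))

# Person A's history attends to person B's history.
a_given_b, attn = cross_attention(person_a, person_b, w_q, w_k, w_v)
```

Swapping the two sequences gives the symmetric direction, in which person B's history attends to person A's. In a trained model the projection matrices would be learned rather than drawn at random.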
Benchmarks
| Benchmark | Methodology | Average MPJPE (mm) by prediction horizon |
|---|---|---|
| multi-person-pose-forecasting-on-expi-common | XIA | 200 ms: 55; 400 ms: 112; 600 ms: 162; 1000 ms: 238 |
| multi-person-pose-forecasting-on-expi-unseen | XIA | 400 ms: 121; 600 ms: 174; 800 ms: 218 |
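The benchmark figures are reported as MPJPE (mean per-joint position error) in millimetres. As a minimal sketch, assuming the standard definition (the Euclidean distance between predicted and ground-truth joint positions, averaged over joints and frames), the metric can be computed as follows. Whether the benchmark applies any alignment or normalization before measuring the error is not stated here, so the helper `mpjpe` and the toy data are illustrative only.

```python
import numpy as np

def mpjpe(pred, target):
    """Mean per-joint position error, in the units of the inputs.

    pred, target: arrays of shape (..., joints, 3) holding 3D joint positions.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")
    # Euclidean distance per joint, then the mean over all joints and frames.
    return float(np.linalg.norm(pred - target, axis=-1).mean())

# Toy data: 2 frames, 3 joints, with every joint offset by (3, 4, 0) mm.
target = np.zeros((2, 3, 3))
pred = target.copy()
pred[..., 0] = 3.0
pred[..., 1] = 4.0

error_mm = mpjpe(pred, target)  # each joint is off by sqrt(3^2 + 4^2) = 5 mm
```

To report the error at a given horizon such as 400 ms, the same function would be applied to the predicted frame at that time step.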