FineParser: A Fine-grained Spatio-temporal Action Parser for Human-centric Action Quality Assessment
Xu Jinglin ; Yin Sibo ; Zhao Guohao ; Wang Zishuo ; Peng Yuxin

Abstract
Existing action quality assessment (AQA) methods mainly learn deep representations at the video level for scoring diverse actions. Because they lack a fine-grained understanding of the actions in videos, they suffer from low credibility and interpretability and are thus insufficient for stringent applications such as Olympic diving events. We argue that a fine-grained understanding of actions requires the model to perceive and parse actions in both time and space, which is also the key to the credibility and interpretability of the AQA technique. Based on this insight, we propose a new fine-grained spatio-temporal action parser named FineParser. It learns human-centric foreground action representations by focusing on target action regions within each frame and exploiting their fine-grained alignments in time and space to minimize the impact of invalid backgrounds during the assessment. In addition, we construct fine-grained annotations of human-centric foreground action masks for the FineDiving dataset, called FineDiving-HM. With refined annotations on diverse target action procedures, FineDiving-HM can promote the development of real-world AQA systems. Through extensive experiments, we demonstrate the effectiveness of FineParser, which outperforms state-of-the-art methods while supporting more tasks of fine-grained action understanding. Data and code are available at https://github.com/PKU-ICST-MIPL/FineParser_CVPR2024.
Benchmarks
| Benchmark | Methodology | RL2 (×100) | Spearman Correlation |
|---|---|---|---|
| action-quality-assessment-on-finediving | FineParser | 0.2602 | 0.9435 |
| action-quality-assessment-on-mtl-aqa | FineParser | 0.241 | 95.85 |
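
The two metrics in the table can be illustrated with a minimal sketch. It assumes the definitions commonly used in AQA work: Spearman correlation is the Pearson correlation of rank-transformed scores, and RL2 is the mean squared error after normalizing each error by the score range. The function names, the NumPy-only implementation, and the single global score range are assumptions made for illustration (some benchmarks normalize per action type); this is not the authors' evaluation code.

```python
import numpy as np

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    # Average the ranks inside each group of tied values.
    for v in np.unique(values):
        tied = values == v
        ranks[tied] = ranks[tied].mean()
    return ranks

def spearman_correlation(pred, true):
    """Pearson correlation computed on the ranks of the two score lists."""
    rp = average_ranks(pred)
    rt = average_ranks(true)
    rp = rp - rp.mean()
    rt = rt - rt.mean()
    return float((rp * rt).sum() / np.sqrt((rp ** 2).sum() * (rt ** 2).sum()))

def relative_l2(pred, true, score_min, score_max):
    """Mean squared error after dividing each error by the score range.

    score_min and score_max are assumed to be the lowest and highest
    ground-truth scores used for normalization.
    """
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=float)
    normalized_error = np.abs(pred - true) / (score_max - score_min)
    return float(np.mean(normalized_error ** 2))

# Hypothetical predicted and ground-truth scores, for illustration only.
predicted = [10.0, 20.0, 30.0, 40.0]
ground_truth = [12.0, 18.0, 33.0, 41.0]

rho = spearman_correlation(predicted, ground_truth)
rl2_times_100 = 100 * relative_l2(predicted, ground_truth, 0.0, 100.0)
```

A higher Spearman correlation means the predicted ranking of performances agrees more closely with the judges' ranking, while a lower RL2 means the predicted scores are numerically closer to the ground truth.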