3 months ago

DG-STGCN: Dynamic Spatial-Temporal Modeling for Skeleton-based Action Recognition

Haodong Duan Jiaqi Wang Kai Chen Dahua Lin

Abstract

Graph convolutional networks (GCNs) have been widely used in skeleton-based action recognition. We note that existing GCN-based approaches primarily rely on prescribed graphical structures (i.e., a manually defined topology of skeleton joints), which limits their flexibility in capturing complicated correlations between joints. To move beyond this limitation, we propose a new framework for skeleton-based action recognition, namely Dynamic Group Spatio-Temporal GCN (DG-STGCN). It consists of two modules, DG-GCN and DG-TCN, for spatial and temporal modeling, respectively. In particular, DG-GCN uses learned affinity matrices to capture dynamic graphical structures instead of relying on a prescribed one, while DG-TCN performs group-wise temporal convolutions with varying receptive fields and incorporates a dynamic joint-skeleton fusion module for adaptive multi-level temporal modeling. On a wide range of benchmarks, including NTU RGB+D, Kinetics-Skeleton, BABEL, and Toyota SmartHome, DG-STGCN consistently outperforms state-of-the-art methods, often by a notable margin.
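The two ideas named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the abstract gives no formulas, so the function names, the random initialization, the contiguous channel grouping, and the single kernel shared across groups are all assumptions made for illustration. The first part mixes joints with one freely parameterized affinity matrix per channel group rather than a fixed skeleton adjacency; the second applies a temporal filter whose dilation (and hence receptive field) differs per channel group. The dynamic joint-skeleton fusion module is not covered.

```python
import random


def init_affinity(num_groups, num_joints, seed=0):
    """Hypothetical helper: one V x V matrix per channel group.

    In a trained model these entries would be learned parameters;
    here they are just small random values (an assumption).
    """
    rng = random.Random(seed)
    return [[[rng.uniform(-0.1, 0.1) for _ in range(num_joints)]
             for _ in range(num_joints)]
            for _ in range(num_groups)]


def group_spatial_aggregate(x, affinity):
    """Mix joint features using a per-group affinity matrix.

    x:        [C][V] features of one frame (C channels, V joints)
    affinity: [K][V][V]; channel c uses matrix c // (C // K)
    returns:  [C][V] with out[c][v] = sum_u affinity[k][v][u] * x[c][u]
    """
    num_channels, num_groups = len(x), len(affinity)
    if num_channels % num_groups != 0:
        raise ValueError("channels must divide evenly into groups")
    per_group = num_channels // num_groups
    out = []
    for c, row in enumerate(x):
        a = affinity[c // per_group]
        out.append([sum(a[v][u] * row[u] for u in range(len(row)))
                    for v in range(len(row))])
    return out


def dilated_temporal_conv(seq, kernel, dilation):
    """1D convolution over time with zero padding (output length = input length)."""
    half = len(kernel) // 2
    out = []
    for t in range(len(seq)):
        acc = 0.0
        for i, w in enumerate(kernel):
            s = t + (i - half) * dilation  # dilated tap position
            if 0 <= s < len(seq):
                acc += w * seq[s]
        out.append(acc)
    return out


def group_temporal_conv(x, kernel, dilations):
    """Filter each channel group with its own dilation.

    x:         [C][T] per-channel time series for one joint
    dilations: one dilation per group; a larger value widens the receptive field
    The kernel is shared across groups here purely to keep the sketch short.
    """
    if len(x) % len(dilations) != 0:
        raise ValueError("channels must divide evenly into groups")
    per_group = len(x) // len(dilations)
    return [dilated_temporal_conv(seq, kernel, dilations[c // per_group])
            for c, seq in enumerate(x)]
```

With identity affinity matrices, `group_spatial_aggregate` returns its input unchanged, which is a quick sanity check. Any off-diagonal entry lets a joint draw on another joint regardless of whether the two are connected in the physical skeleton, and that is the flexibility the abstract attributes to learned affinities.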

Benchmarks

Benchmark | Methodology | Metrics
skeleton-based-action-recognition-on-ntu-rgbd | DG-STGCN | Accuracy (CS): 93.2; Accuracy (CV): 97.5; Ensembled Modalities: 4
skeleton-based-action-recognition-on-ntu-rgbd-1 | DG-STGCN | Accuracy (Cross-Setup): 91.3; Accuracy (Cross-Subject): 89.6; Ensembled Modalities: 4
