Efficient Multi-stream Temporal Learning and Post-fusion Strategy for 3D Skeleton-based Hand Activity Recognition

Renaud Seguier, Jérôme Royan, Amine Kacete, Nam-Duong Duong, Catherine Soladie, Yasser Boutaleb

Abstract

Recognizing first-person hand activity is a challenging task, especially when not enough data are available. In this paper, we tackle this challenge by proposing a new hybrid learning pipeline for skeleton-based hand activity recognition, composed of three blocks. First, for a given sequence of hand joint positions, spatial features are extracted using a dedicated combination of local and global hand-crafted spatial features. Then, the temporal dependencies are learned using a multi-stream learning strategy. Finally, a hand activity sequence classifier is learned via our Post-fusion strategy, applied to the previously learned temporal dependencies. Experiments on two real-world data sets show that our approach outperforms state-of-the-art approaches. As an additional ablation study, we compared our Post-fusion strategy with three traditional fusion baselines and observed an accuracy improvement of more than 2.4%.
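The three blocks described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The feature definitions, the function names (`local_features`, `global_features`, `encode_stream`, `post_fusion_classify`), the choice of joint 0 as the wrist, and the untrained random classifier weights are all assumptions made for illustration. The temporal encoder in particular is a simple mean/std pooling placeholder standing in for whatever learned temporal model each stream would use.

```python
import numpy as np

def local_features(seq):
    """Hypothetical local feature: joint positions relative to the wrist
    (assumed to be joint 0), flattened per frame."""
    rel = seq[:, 1:, :] - seq[:, :1, :]            # (T, J-1, 3)
    return rel.reshape(seq.shape[0], -1)           # (T, (J-1)*3)

def global_features(seq):
    """Hypothetical global feature: frame-to-frame wrist displacement.
    The first frame gets a zero displacement."""
    wrist = seq[:, 0, :]                           # (T, 3)
    return np.diff(wrist, axis=0, prepend=wrist[:1])  # (T, 3)

def encode_stream(features):
    """Placeholder temporal encoder: mean and std over time.
    A learned sequence model would take this role in practice."""
    return np.concatenate([features.mean(axis=0), features.std(axis=0)])

def post_fusion_classify(embeddings, weights, bias):
    """Fuse the per-stream embeddings by concatenation, then apply a
    linear layer followed by a numerically stable softmax."""
    fused = np.concatenate(embeddings)
    logits = weights @ fused + bias
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

# Toy input: 30 frames, 21 hand joints, 3D coordinates (random stand-in data).
rng = np.random.default_rng(0)
T, J, num_classes = 30, 21, 5
seq = rng.normal(size=(T, J, 3))

# Block 1: one hand-crafted feature sequence per stream.
streams = [local_features(seq), global_features(seq)]
# Block 2: each stream is encoded independently over time.
embeddings = [encode_stream(s) for s in streams]
# Block 3: fusion happens only after the per-stream temporal encoding.
fused_dim = sum(e.shape[0] for e in embeddings)
W = rng.normal(scale=0.01, size=(num_classes, fused_dim))  # untrained weights
b = np.zeros(num_classes)
probs = post_fusion_classify(embeddings, W, b)
```

The point of the sketch is the ordering: each stream is summarized over time on its own, and the classifier only sees the fused per-stream summaries, in contrast to fusing the raw feature sequences before any temporal modeling.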

Benchmarks

Benchmark	Methodology	Metrics
skeleton-based-action-recognition-on-first	Boutaleb et al.	1:1 Accuracy: 96.17
