Learn to cycle: Time-consistent feature discovery for action recognition
Alexandros Stergiou, Ronald Poppe

Abstract
Generalizing over temporal variations is a prerequisite for effective action recognition in videos. Despite significant advances in deep neural networks, it remains a challenge to focus on short-term discriminative motions in relation to the overall performance of an action. We address this challenge by allowing some flexibility in discovering relevant spatio-temporal features. We introduce Squeeze and Recursion Temporal Gates (SRTG), an approach that favors inputs with similar activations with potential temporal variations. We implement this idea with a novel CNN block that uses an LSTM to encapsulate feature dynamics, in conjunction with a temporal gate that is responsible for evaluating the consistency of the discovered dynamics and the modeled features. We show consistent improvement when using SRTG blocks, with only a minimal increase in the number of GFLOPs. On Kinetics-700, we perform on par with current state-of-the-art models, and outperform these on HACS, Moments in Time, UCF-101 and HMDB-51.
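To make the gating idea more concrete, the following is a minimal sketch of one possible reading of "evaluating the consistency of the discovered dynamics and the modeled features". It is not the authors' implementation. It assumes that per-frame feature vectors (`features`) and the recurrent outputs for the same frames (`dynamics`) are already available as plain lists, that consistency means a hard nearest-neighbour cycle (frame t maps to its closest dynamics vector and back to frame t), and that fusion is an element-wise product. The function names and all three of these choices are hypothetical; the abstract does not specify them.

```python
from math import dist  # Euclidean distance, available from Python 3.8


def nearest(query, candidates):
    """Index of the candidate vector closest to `query` (Euclidean)."""
    return min(range(len(candidates)), key=lambda i: dist(query, candidates[i]))


def is_cycle_consistent(features, dynamics):
    """True if every time step t maps into `dynamics` and back to t."""
    for t, f in enumerate(features):
        j = nearest(f, dynamics)                # features -> dynamics
        back = nearest(dynamics[j], features)   # dynamics -> features
        if back != t:
            return False
    return True


def temporal_gate(features, dynamics):
    """Hypothetical gate: fuse dynamics into features only when consistent."""
    if not is_cycle_consistent(features, dynamics):
        return features  # gate closed: features pass through unchanged
    # Gate open: element-wise product is an assumed fusion rule.
    return [[x * d for x, d in zip(f, g)] for f, g in zip(features, dynamics)]


if __name__ == "__main__":
    feats = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
    dyn = [[0.1, 0.0], [1.1, 0.9], [2.0, 2.1]]
    print(temporal_gate(feats, dyn))
```

In this sketch, a sequence whose recurrent outputs preserve the temporal ordering of the features opens the gate, while one whose outputs collapse onto a single frame leaves the features untouched.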
Benchmarks
| Benchmark | Model | Top-1 accuracy (%) | Top-5 accuracy (%) |
|---|---|---|---|
| action-classification-on-kinetics-700 | SRTG r3d-34 | 49.15 | 72.68 |
| action-classification-on-kinetics-700 | SRTG r3d-50 | 53.52 | 74.17 |
| action-classification-on-kinetics-700 | SRTG r3d-101 | 56.46 | 76.82 |
| action-classification-on-kinetics-700 | SRTG r(2+1)d-34 | 49.43 | 73.23 |
| action-classification-on-kinetics-700 | SRTG r(2+1)d-50 | 54.17 | 74.62 |
| action-classification-on-moments-in-time | SRTG r3d-34 | 28.55 | 52.35 |
| action-classification-on-moments-in-time | SRTG r3d-50 | 30.72 | 55.65 |
| action-classification-on-moments-in-time | SRTG r3d-101 | 33.56 | 58.49 |
| action-classification-on-moments-in-time | SRTG r(2+1)d-34 | 28.97 | 54.18 |
| action-classification-on-moments-in-time | SRTG r(2+1)d-50 | 31.60 | 56.80 |
| action-recognition-on-hacs | SRTG r3d-34 | 78.60 | 93.57 |
| action-recognition-on-hacs | SRTG r3d-50 | 80.36 | 95.55 |
| action-recognition-on-hacs | SRTG r3d-101 | 81.66 | 96.33 |
| action-recognition-on-hacs | SRTG r(2+1)d-34 | 80.39 | 94.27 |
| action-recognition-on-hacs | SRTG r(2+1)d-50 | 83.77 | 96.56 |
| action-recognition-on-hacs | SRTG r(2+1)d-101 | 84.33 | 96.85 |
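
For readers unfamiliar with the Top-1 and Top-5 accuracy metrics reported in the table, the sketch below shows how top-k accuracy is conventionally computed: a sample counts as correct when its true label is among the k highest-scoring classes. The function name `top_k_accuracy` and the toy scores are illustrative only and are not taken from the paper or its evaluation code.

```python
def top_k_accuracy(scores, labels, k):
    """Percentage of samples whose true label is among the k highest scores.

    scores: list of per-class score lists, one per sample.
    labels: list of integer class indices, one per sample.
    """
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(labels)


if __name__ == "__main__":
    # Toy example: 3 samples, 3 classes (made-up numbers).
    scores = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]]
    labels = [1, 1, 0]
    print(top_k_accuracy(scores, labels, k=1))
    print(top_k_accuracy(scores, labels, k=2))
```

Top-5 accuracy is always at least as high as Top-1 for the same model, because the top five predictions include the top one.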