3 months ago

ActionFormer: Localizing Moments of Actions with Transformers

Chenlin Zhang, Jianxin Wu, Yin Li

Abstract

Self-attention-based Transformer models have demonstrated impressive results for image classification and object detection, and more recently for video understanding. Inspired by this success, we investigate the application of Transformer networks to temporal action localization in videos. To this end, we present ActionFormer, a simple yet powerful model that identifies actions in time and recognizes their categories in a single shot, without using action proposals or relying on pre-defined anchor windows. ActionFormer combines a multiscale feature representation with local self-attention, and uses a lightweight decoder to classify every moment in time and estimate the corresponding action boundaries. We show that this orchestrated design results in major improvements over prior work. Without bells and whistles, ActionFormer achieves 71.0% mAP at tIoU=0.5 on THUMOS14, outperforming the best prior model by 14.1 absolute percentage points. Further, ActionFormer demonstrates strong results on ActivityNet 1.3 (36.6% average mAP) and EPIC-Kitchens 100 (+13.5% average mAP over prior work). Our code is available at http://github.com/happyharrycn/actionformer_release.
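To make the "tIoU=0.5" criterion and the per-moment boundary estimate more concrete, here is a minimal Python sketch. It is not taken from the official repository: the function names and the decoding rule (each moment t predicts its distances to the action start and end) are assumptions made for illustration only.

```python
def temporal_iou(pred, gt):
    """Temporal IoU between two (start, end) segments, e.g. in seconds."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def decode_moment(t, d_start, d_end):
    """Hypothetical decoding: assumes a moment t predicts its distance
    back to the action start and forward to the action end."""
    return (t - d_start, t + d_end)


# A moment at t=12s predicting offsets of 2s (to start) and 4s (to end).
segment = decode_moment(t=12.0, d_start=2.0, d_end=4.0)
iou = temporal_iou(segment, (11.0, 17.0))

# At a tIoU threshold of 0.5, this prediction would count as a match.
is_match = iou >= 0.5
```

Under this sketch, a prediction only counts toward mAP at a given threshold if its overlap with a ground-truth action of the same class reaches that threshold, which is why the reported mAP falls as the tIoU threshold rises.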

Code Repositories

happyharrycn/actionformer_release (official, PyTorch)

Benchmarks

Benchmark: temporal-action-localization-on-activitynet
Methodology: ActionFormer (TSP features)
Metrics:
mAP: 36.6
mAP IOU@0.5: 54.7
mAP IOU@0.75: 37.8
mAP IOU@0.95: 8.4

Benchmark: temporal-action-localization-on-epic-kitchens
Methodology: ActionFormer (verb)
Metrics:
Avg mAP (0.1-0.5): 23.5
mAP IOU@0.1: 26.6
mAP IOU@0.2: 25.4
mAP IOU@0.3: 24.2
mAP IOU@0.4: 22.3
mAP IOU@0.5: 19.1

Benchmark: temporal-action-localization-on-thumos14
Methodology: ActionFormer (I3D features)
Metrics:
Avg mAP (0.3:0.7): 66.8
mAP IOU@0.3: 82.1
mAP IOU@0.4: 77.8
mAP IOU@0.5: 71.0
mAP IOU@0.6: 59.4
mAP IOU@0.7: 43.9
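
The average mAP figures for THUMOS14 and EPIC-Kitchens appear to be the plain mean of the per-threshold values listed above. The short Python check below illustrates this; the variable and function names are illustrative, and the ActivityNet average is left out because only three of its thresholds are listed here.

```python
from statistics import mean

# Per-threshold mAP values copied from the benchmark listing above.
thumos14 = {0.3: 82.1, 0.4: 77.8, 0.5: 71.0, 0.6: 59.4, 0.7: 43.9}
epic_verb = {0.1: 26.6, 0.2: 25.4, 0.3: 24.2, 0.4: 22.3, 0.5: 19.1}


def average_map(per_threshold):
    """Mean mAP over the listed tIoU thresholds, rounded to one decimal."""
    return round(mean(per_threshold.values()), 1)


thumos14_avg = average_map(thumos14)    # compare with the listed 66.8
epic_verb_avg = average_map(epic_verb)  # compare with the listed 23.5
```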
