5 months ago

Holistic Interaction Transformer Network for Action Detection

Faure Gueter Josmy ; Chen Min-Hung ; Lai Shang-Hong

Abstract

Actions are about how we interact with the environment, including other people, objects, and ourselves. In this paper, we propose a novel multi-modal Holistic Interaction Transformer Network (HIT) that leverages the largely ignored, but critical hand and pose information essential to most human actions. The proposed "HIT" network is a comprehensive bi-modal framework that comprises an RGB stream and a pose stream. Each of them separately models person, object, and hand interactions. Within each sub-network, an Intra-Modality Aggregation module (IMA) is introduced that selectively merges individual interaction units. The resulting features from each modality are then glued using an Attentive Fusion Mechanism (AFM). Finally, we extract cues from the temporal context to better classify the occurring actions using cached memory. Our method significantly outperforms previous approaches on the J-HMDB, UCF101-24, and MultiSports datasets. We also achieve competitive results on AVA. The code will be available at https://github.com/joslefaure/HIT.
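The abstract does not say how the Attentive Fusion Mechanism combines the two streams. As a rough illustration of the general idea of attention-weighted fusion only, the sketch below scores an RGB feature vector and a pose feature vector against a query, normalises the two scores with a softmax, and returns the weighted sum. The function names, the scaled dot-product scoring, and the `query` vector are all assumptions made for this example; this is not the authors' AFM, for which the repository linked above is the reference.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attentive_fuse(rgb_feat, pose_feat, query):
    """Fuse two equal-length feature vectors with attention weights.

    Hypothetical illustration, not the paper's AFM: each modality is
    scored by its scaled dot product with `query`, the two scores are
    softmax-normalised, and the output is the weighted sum of the inputs.
    """
    if not (len(rgb_feat) == len(pose_feat) == len(query)):
        raise ValueError("all vectors must have the same length")
    scale = math.sqrt(len(query))
    scores = [
        sum(q * x for q, x in zip(query, feat)) / scale
        for feat in (rgb_feat, pose_feat)
    ]
    w_rgb, w_pose = softmax(scores)
    fused = [w_rgb * r + w_pose * p for r, p in zip(rgb_feat, pose_feat)]
    return fused, (w_rgb, w_pose)


if __name__ == "__main__":
    # Toy 2-d features; a query aligned with the RGB vector favours RGB.
    fused, weights = attentive_fuse([1.0, 0.0], [0.0, 1.0], [2.0, 0.0])
    print(fused, weights)
```

With a zero query both modalities receive equal weight, so the fused vector is the plain average of the two inputs; a query aligned with one modality shifts the weight toward it.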

Code Repositories

joslefaure/hit (official)

Benchmarks

Benchmark                          Methodology   Metrics
action-detection-on-j-hmdb         HIT           Frame-mAP 0.5: 83.8; Video-mAP 0.2: 89.7; Video-mAP 0.5: 88.1
action-detection-on-multisports    HIT           Frame-mAP 0.5: 33.3; Video-mAP 0.2: 27.8; Video-mAP 0.5: 8.8
action-detection-on-ucf101-24      HIT           Frame-mAP 0.5: 84.8; Video-mAP 0.2: 88.8; Video-mAP 0.5: 74.3
action-recognition-on-ava-v2-2     HIT           mAP: 32.6
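The numbers after each metric name (0.5, 0.2) are, under the usual convention for action detection benchmarks, intersection-over-union (IoU) thresholds: a detection counts as correct only if its class matches and its overlap with a ground-truth box (frame-mAP) or tube (video-mAP) reaches the threshold. The page itself does not define the metrics, so that reading is an assumption. The sketch below shows only the per-box matching criterion for the frame-level case; the helper names are hypothetical, and a full mAP computation additionally ranks detections by score and matches each ground-truth box at most once.

```python
def box_iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, gt_box, pred_label, gt_label, threshold=0.5):
    """True if the labels match and the box overlap reaches `threshold`."""
    return pred_label == gt_label and box_iou(pred_box, gt_box) >= threshold


if __name__ == "__main__":
    prediction = (0, 0, 10, 10)
    ground_truth = (0, 0, 10, 8)
    print(box_iou(prediction, ground_truth))
    print(is_true_positive(prediction, ground_truth, "run", "run"))
```

Lowering the threshold from 0.5 to 0.2 accepts looser localisation, which is consistent with the Video-mAP 0.2 scores above being at least as high as the Video-mAP 0.5 scores on every dataset.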
