3 months ago

PDAN: Pyramid Dilated Attention Network for Action Detection

Francois Bremond, Gianpiero Francesca, Lorenzo Garattoni, Luca Minciullo, Srijan Das, Rui Dai

Abstract

Handling long and complex temporal information is an important challenge for action detection. The challenge is aggravated by densely distributed actions in untrimmed videos. Previous action detection methods fail to select the key temporal information in long videos. To this end, we introduce the Dilated Attention Layer (DAL). Unlike a standard temporal convolution layer, DAL assigns attention weights to the local frames within its kernel, which enables it to learn better local representations across time. Furthermore, we introduce the Pyramid Dilated Attention Network (PDAN), which is built upon DAL. Using multiple DALs with different dilation rates, PDAN can model short-term and long-term temporal relations simultaneously by focusing on local segments at both low and high temporal receptive fields. This property enables PDAN to handle complex temporal relations between different action instances in long untrimmed videos. To corroborate the effectiveness and robustness of our method, we evaluate it on three densely annotated, multi-label datasets: MultiTHUMOS, Charades, and Toyota Smarthome Untrimmed (TSU). PDAN outperforms previous state-of-the-art methods on all three datasets.
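
The idea of attention over the frames inside a dilated kernel can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the projections, kernel size, or dilation schedule, so the code below assumes a kernel of 3 taps, identity query/key/value (no learned weights), scaled dot-product scores, and dilation rates that double per layer. The function names `dilated_attention_layer`, `pyramid`, and `receptive_field` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def dilated_attention_layer(frames, dilation, kernel_size=3):
    """Attend over kernel taps spaced `dilation` frames apart.

    frames: list of T feature vectors (lists of floats).
    Assumption: the centre frame acts as the query and the tapped
    frames act as keys and values, with no learned projections.
    """
    num_frames = len(frames)
    dim = len(frames[0])
    half = kernel_size // 2
    scale = math.sqrt(dim)
    out = []
    for t in range(num_frames):
        # Kernel taps around frame t; taps outside the video are skipped
        # (a simplification, zero padding would be another option).
        taps = [t + k * dilation for k in range(-half, half + 1)]
        taps = [i for i in taps if 0 <= i < num_frames]
        # Attention weights replace the fixed weights of a convolution.
        weights = softmax([dot(frames[t], frames[i]) / scale for i in taps])
        out.append([
            sum(w * frames[i][d] for w, i in zip(weights, taps))
            for d in range(dim)
        ])
    return out


def pyramid(frames, num_layers=3):
    """Stack layers with dilation 1, 2, 4, ... (assumed schedule)."""
    x = frames
    for layer in range(num_layers):
        x = dilated_attention_layer(x, dilation=2 ** layer)
    return x


def receptive_field(num_layers, kernel_size=3):
    """Frames visible to one output frame after `num_layers` layers."""
    return 1 + (kernel_size - 1) * sum(2 ** l for l in range(num_layers))
```

Under these assumptions, the first layer mixes only adjacent frames, while deeper layers reach frames further apart; three layers with kernel size 3 give each output frame a receptive field of 15 input frames. This is one way to picture how short-term and long-term relations can be modelled in the same network.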

Benchmarks

| Benchmark | Methodology | Metrics |
| --- | --- | --- |
| action-detection-on-charades | PDAN (RGB+Flow) | mAP: 26.5 |
| action-detection-on-multi-thumos | PDAN | mAP: 47.6 |
| action-detection-on-tsu | PDAN | Frame-mAP: 32.7 |
| temporal-action-localization-on-multithumos-1 | PDAN | Average mAP: 17.3 |
