Temporal Convolutional Networks for Action Segmentation and Detection

Colin Lea; Michael D. Flynn; Rene Vidal; Austin Reiter; Gregory D. Hager

Abstract

The ability to identify and temporally segment fine-grained human actions throughout a video is crucial for robotics, surveillance, education, and beyond. Typical approaches decouple this problem by first extracting local spatiotemporal features from video frames and then feeding them into a temporal classifier that captures high-level temporal patterns. We introduce a new class of temporal models, which we call Temporal Convolutional Networks (TCNs), that use a hierarchy of temporal convolutions to perform fine-grained action segmentation or detection. Our Encoder-Decoder TCN uses pooling and upsampling to efficiently capture long-range temporal patterns, whereas our Dilated TCN uses dilated convolutions. We show that TCNs are capable of capturing action compositions, segment durations, and long-range dependencies, and are over an order of magnitude faster to train than competing LSTM-based Recurrent Neural Networks. We apply these models to three challenging fine-grained datasets and show large improvements over the state of the art.
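To make the two mechanisms named in the abstract concrete, the following is a minimal pure-Python sketch of their building blocks: a dilated temporal convolution (as in the Dilated TCN) and a pooling/upsampling pair (as in the Encoder-Decoder TCN). It is an illustration under stated assumptions, not the authors' implementation: the function names, the zero padding, the centered (acausal) filter, and the nearest-neighbor upsampling are all choices made here for clarity, and the learned filters, nonlinearities, and normalization a real model would need are omitted.

```python
from typing import List, Sequence


def dilated_conv1d(x: Sequence[float], w: Sequence[float], dilation: int) -> List[float]:
    """Centered 1D dilated convolution (cross-correlation form) with zero padding.

    Hypothetical helper: output has the same length as the input, and taps
    falling outside the sequence contribute zero.
    """
    if len(w) % 2 == 0:
        raise ValueError("this sketch assumes an odd filter length")
    half = len(w) // 2
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, wj in enumerate(w):
            idx = t + (j - half) * dilation  # taps are spaced `dilation` steps apart
            if 0 <= idx < len(x):
                acc += wj * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Number of input time steps seen by one output of a stack of dilated layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def max_pool1d(x: Sequence[float], width: int) -> List[float]:
    """Non-overlapping temporal max pooling (encoder side of the sketch)."""
    return [max(x[i:i + width]) for i in range(0, len(x), width)]


def upsample1d(x: Sequence[float], factor: int) -> List[float]:
    """Nearest-neighbor upsampling by repetition (decoder side of the sketch)."""
    return [v for v in x for _ in range(factor)]


if __name__ == "__main__":
    impulse = [0, 0, 0, 1, 0, 0, 0]
    # With dilation 2, a length-3 filter touches time steps t-2, t, t+2.
    print(dilated_conv1d(impulse, [1, 1, 1], dilation=2))
    # Doubling the dilation per layer grows the receptive field exponentially.
    print(receptive_field(3, [1, 2, 4, 8]))
    # Pooling halves the temporal resolution; upsampling restores the length.
    print(upsample1d(max_pool1d([1, 3, 2, 5], 2), 2))
```

Both routes widen the temporal context seen by each output: stacking dilated layers enlarges the receptive field without reducing resolution, while pooling shortens the sequence so that later filters of the same length span more of the original time axis.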

Benchmarks

| Benchmark | Methodology | Metrics |
| --- | --- | --- |
| action-segmentation-on-gtea-1 | ED-TCN | Acc: 64.0; Edit: -; F1@10%: 72.2; F1@25%: 69.3; F1@50%: 56.0 |
| skeleton-based-action-recognition-on-varying | TCN | Accuracy (AV I): 43%; Accuracy (AV II): 64%; Accuracy (CS): 56%; Accuracy (CV I): 16%; Accuracy (CV II): 43% |
