3 months ago

Efficient Two-Step Networks for Temporal Action Segmentation

Shenglan Liu, YuHan Wang, Li Xu, Jie Zhu, Lianyu Hu, Lin Feng, Kaiyuan Liu, Zhuben Dong, Yunheng Li

Abstract

Because of boundary ambiguity and over-segmentation, labeling every frame in long untrimmed videos remains challenging. To address these problems, we present the Efficient Two-Step Network (ETSN), which has two components. The first step, Efficient Temporal Series Pyramid Networks (ETSPNet), captures both local and global frame-level features and provides accurate predictions of segmentation boundaries. The second step is a novel unsupervised approach called Local Burr Suppression (LBS), which significantly reduces over-segmentation errors. Our empirical evaluations on benchmarks including the 50Salads, GTEA, and Breakfast datasets demonstrate that ETSN outperforms the current state-of-the-art methods by a large margin.
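To make the over-segmentation problem concrete, the sketch below collapses frame-wise labels into segments and relabels very short runs with the label of the preceding segment. This is not the paper's LBS, whose details are not given in this abstract; it is a simple stand-in that only illustrates what suppressing spurious short segments can look like. The function names, the `min_len` parameter, and the toy labels are all assumptions made for illustration.

```python
from itertools import groupby


def to_segments(labels):
    """Collapse frame-wise labels into (label, start, end) runs; end is exclusive."""
    segments = []
    start = 0
    for label, run in groupby(labels):
        length = len(list(run))
        segments.append((label, start, start + length))
        start += length
    return segments


def suppress_short_runs(labels, min_len):
    """Relabel runs shorter than min_len with the previous run's label.

    Hypothetical stand-in for a burr-suppression step, NOT the paper's LBS.
    A short run at the very start is left unchanged because it has no
    preceding segment to absorb it.
    """
    smoothed = []
    for label, start, end in to_segments(labels):
        length = end - start
        if length < min_len and smoothed:
            label = smoothed[-1]  # absorb the short run into the previous label
        smoothed.extend([label] * length)
    return smoothed


# Toy prediction: a one-frame "peel" burr splits a single "cut" action in two.
noisy = ["cut"] * 5 + ["peel"] * 1 + ["cut"] * 4 + ["mix"] * 6
smoothed = suppress_short_runs(noisy, min_len=3)

print(len(to_segments(noisy)), "segments before smoothing")
print(len(to_segments(smoothed)), "segments after smoothing")
```

A fixed length threshold like this can also erase genuinely short actions, which is one reason a purpose-built method is preferable in practice.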

Benchmarks

Benchmark | Methodology | Metrics
action-segmentation-on-50-salads-1 | ETSN | Acc: 82.0, Edit: 78.8, F1@10%: 85.2, F1@25%: 83.9, F1@50%: 75.4
action-segmentation-on-breakfast-1 | ETSN | Acc: 67.8, Average F1: 66.4, Edit: 70.3, F1@10%: 74.0, F1@25%: 69.0, F1@50%: 56.2
action-segmentation-on-gtea-1 | ETSN | Acc: 78.2, Edit: 86.2, F1@10%: 91.1, F1@25%: 90.0, F1@50%: 77.9
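The metric names above follow the usual conventions for temporal action segmentation. The sketch below shows one common way to compute them from frame-wise labels: frame accuracy (Acc), the segmental edit score (Edit), and segmental F1 at an IoU threshold (F1@k). It is a minimal illustration of the standard definitions as we understand them, not the evaluation code used for the numbers above; the function names and toy sequences are assumptions.

```python
from itertools import groupby


def to_segments(labels):
    """Collapse frame-wise labels into (label, start, end) runs; end is exclusive."""
    segments, start = [], 0
    for label, run in groupby(labels):
        length = len(list(run))
        segments.append((label, start, start + length))
        start += length
    return segments


def frame_accuracy(pred, gt):
    """Percentage of frames whose predicted label matches the ground truth."""
    correct = sum(p == g for p, g in zip(pred, gt))
    return 100.0 * correct / len(gt)


def edit_score(pred, gt):
    """Normalized Levenshtein distance between the two segment label sequences."""
    p = [seg[0] for seg in to_segments(pred)]
    g = [seg[0] for seg in to_segments(gt)]
    m, n = len(p), len(g)
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dist[i][0] = i
    for j in range(n + 1):
        dist[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if p[i - 1] == g[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,         # deletion
                             dist[i][j - 1] + 1,         # insertion
                             dist[i - 1][j - 1] + cost)  # substitution
    return 100.0 * (1 - dist[m][n] / max(m, n))


def f1_at_k(pred, gt, threshold):
    """Segmental F1: a predicted segment is a true positive if its IoU with an
    unmatched ground-truth segment of the same label reaches the threshold."""
    gt_segs = to_segments(gt)
    matched = [False] * len(gt_segs)
    tp = fp = 0
    for label, ps, pe in to_segments(pred):
        best_iou, best_j = 0.0, -1
        for j, (g_label, gs, ge) in enumerate(gt_segs):
            if g_label != label:
                continue
            inter = max(0, min(pe, ge) - max(ps, gs))
            union = max(pe, ge) - min(ps, gs)
            iou = inter / union
            if iou > best_iou:
                best_iou, best_j = iou, j
        if best_j >= 0 and best_iou >= threshold and not matched[best_j]:
            tp += 1
            matched[best_j] = True
        else:
            fp += 1
    fn = matched.count(False)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 100.0 * 2 * precision * recall / (precision + recall)


# Toy example: one spurious "peel" frame over-segments the prediction.
gt = ["cut"] * 10 + ["mix"] * 6
pred = ["cut"] * 5 + ["peel"] * 1 + ["cut"] * 4 + ["mix"] * 6

acc = frame_accuracy(pred, gt)
edit = edit_score(pred, gt)
f1_50 = f1_at_k(pred, gt, 0.5)
print(f"Acc: {acc:.2f}  Edit: {edit:.2f}  F1@50%: {f1_50:.2f}")

# The Breakfast "Average F1" is consistent with the mean of its three F1@k values.
breakfast_avg_f1 = (74.0 + 69.0 + 56.2) / 3
```

In the toy example a single mislabeled frame barely changes Acc but sharply lowers Edit and F1@50%, which is why the segmental metrics are reported alongside frame accuracy.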
