Mingfei Gao, Yingbo Zhou, Ran Xu, Richard Socher, Caiming Xiong

Abstract
Online action detection in untrimmed videos aims to identify an action as it happens, which makes it very important for real-time applications. Previous methods rely on tedious annotations of temporal action boundaries for training, which hinders the scalability of online action detection systems. We propose WOAD, a weakly supervised framework that can be trained using only video-class labels. WOAD contains two jointly trained modules: a temporal proposal generator (TPG) and an online action recognizer (OAR). Supervised by video-class labels, TPG works offline and aims to accurately mine pseudo frame-level labels for OAR. With the supervisory signals from TPG, OAR learns to conduct action detection in an online fashion. Experimental results on THUMOS'14, ActivityNet1.2 and ActivityNet1.3 show that our weakly supervised method substantially outperforms weakly supervised baselines and achieves performance comparable to previous strongly supervised methods. Beyond that, WOAD can flexibly leverage strong supervision when it is available. When strongly supervised, our method obtains state-of-the-art results on both online per-frame action recognition and online detection of action start.
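
To make the idea of pseudo frame-level labels concrete, the following Python sketch shows one simple way per-frame class scores could be combined with video-class labels to produce frame labels. It is an illustration under stated assumptions, not WOAD's actual TPG procedure: the function name `pseudo_frame_labels`, the `BACKGROUND` id, the fixed `threshold`, and the thresholding rule itself are all hypothetical.

```python
from typing import List, Set

BACKGROUND = -1  # hypothetical id meaning "no action in this frame"


def pseudo_frame_labels(
    frame_scores: List[List[float]],
    video_classes: Set[int],
    threshold: float = 0.5,
) -> List[int]:
    """Assign each frame a pseudo label (assumed scheme, not the paper's).

    frame_scores[t][c] is an offline module's score for class c at frame t.
    Only classes listed in the video-level label set are considered; a frame
    whose best such score is below `threshold` is labeled as background.
    """
    labels = []
    for scores in frame_scores:
        best_cls, best_score = BACKGROUND, threshold
        for cls in sorted(video_classes):
            if scores[cls] >= best_score:
                best_cls, best_score = cls, scores[cls]
        labels.append(best_cls)
    return labels


# Toy example: 4 frames, 2 classes.
scores = [[0.1, 0.2], [0.9, 0.3], [0.8, 0.95], [0.2, 0.1]]
only_class_0 = pseudo_frame_labels(scores, {0})
both_classes = pseudo_frame_labels(scores, {0, 1})
```

In this toy setup, restricting candidates to the video-level classes is what lets a weak label constrain frame-level supervision: class 1 can never be assigned to a frame of a video labeled only with class 0, however high its score. An online recognizer could then be trained on such labels with an ordinary per-frame classification loss.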
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| online-action-detection-on-thumos-14 | WOAD | mAP: 67.1 |