ST-HOI: A Spatial-Temporal Baseline for Human-Object Interaction Detection in Videos
Chiou Meng-Jiun ; Liao Chun-Yu ; Wang Li-Wei ; Zimmermann Roger ; Feng Jiashi

Abstract
Detecting human-object interactions (HOI) is an important step toward a comprehensive visual understanding of machines. While detecting non-temporal HOIs (e.g., sitting on a chair) from static images is feasible, it is unlikely even for humans to guess temporal-related HOIs (e.g., opening/closing a door) from a single video frame, where the neighboring frames play an essential role. However, conventional HOI methods operating on only static images have been used to predict temporal-related interactions, which is essentially guessing without temporal contexts and may lead to sub-optimal performance. In this paper, we bridge this gap by detecting video-based HOIs with explicit temporal information. We first show that a naive temporal-aware variant of a common action detection baseline does not work on video-based HOIs due to a feature-inconsistency issue. We then propose a simple yet effective architecture named Spatial-Temporal HOI Detection (ST-HOI) utilizing temporal information such as human and object trajectories, correctly-localized visual features, and spatial-temporal masking pose features. We construct a new video HOI benchmark dubbed VidHOI where our proposed approach serves as a solid baseline.
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| human-object-interaction-anticipation-on | STTRAN | Person-wise Top5: t=1 (mAP@0.5): 29.09; Person-wise Top5: t=3 (mAP@0.5): 27.59; Person-wise Top5: t=5 (mAP@0.5): 27.32 |
| human-object-interaction-detection-on-vidhoi | STTRAN | Detection: Full (mAP@0.5): 7.61; Detection: Non-Rare (mAP@0.5): 13.18; Detection: Rare (mAP@0.5): 3.33; Oracle: Full (mAP@0.5): 28.32; Oracle: Non-Rare (mAP@0.5): 42.08; Oracle: Rare (mAP@0.5): 17.74 |
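
The page does not spell out what the `@0.5` in `mAP@0.5` refers to. A minimal sketch follows, assuming the convention commonly used in HOI detection: a predicted triplet counts as a true positive only when the interaction label matches and both the human box and the object box overlap their ground-truth boxes with IoU of at least 0.5. The names `iou` and `is_true_positive` are hypothetical helpers for illustration, not part of any VidHOI or ST-HOI code.

```python
from typing import Tuple

# Axis-aligned box as (x1, y1, x2, y2); this layout is an assumption.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(
    pred_human: Box,
    pred_object: Box,
    pred_label: str,
    gt_human: Box,
    gt_object: Box,
    gt_label: str,
    thresh: float = 0.5,
) -> bool:
    """Assumed matching rule: same label and both boxes reach the IoU threshold."""
    return (
        pred_label == gt_label
        and iou(pred_human, gt_human) >= thresh
        and iou(pred_object, gt_object) >= thresh
    )
```

Under this assumed rule, a prediction with a well-localized human but a poorly localized object is still counted as a miss, which is one reason the "Oracle" rows (ground-truth boxes) score far higher than the "Detection" rows.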