Actor and Action Video Segmentation from a Sentence

Kirill Gavrilyuk; Amir Ghodrati; Zhenyang Li; Cees G.M. Snoek

Abstract

This paper strives for pixel-level segmentation of actors and their actions in video content. Different from existing works, which all learn to segment from a fixed vocabulary of actor and action pairs, we infer the segmentation from a natural language input sentence. This allows us to distinguish between fine-grained actors in the same super-category, identify actor and action instances, and segment pairs that are outside of the actor and action vocabulary. We propose a fully-convolutional model for pixel-level actor and action segmentation using an encoder-decoder architecture optimized for video. To show the potential of actor and action video segmentation from a sentence, we extend two popular actor and action datasets with more than 7,500 natural language descriptions. Experiments demonstrate the quality of the sentence-guided segmentations, the generalization ability of our model, and its advantage for traditional actor and action segmentation compared to the state of the art.

Code Repositories

JerryX1110/awesome-rvos
Mentioned in GitHub

Benchmarks

| Benchmark | Methodology | AP | IoU mean | IoU overall | Precision@0.5 | Precision@0.6 | Precision@0.7 | Precision@0.8 | Precision@0.9 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| referring-expression-segmentation-on-a2d | Gavrilyuk et al. (Optical flow) | 0.215 | 0.426 | 0.551 | 0.5 | 0.376 | 0.231 | 0.094 | 0.004 |
| referring-expression-segmentation-on-a2d | Gavrilyuk et al. | 0.198 | 0.421 | 0.536 | 0.475 | 0.347 | 0.211 | 0.08 | 0.002 |
| referring-expression-segmentation-on-j-hmdb | Gavrilyuk et al. | 0.233 | 0.542 | 0.541 | 0.699 | 0.460 | 0.173 | 0.014 | 0.000 |
| referring-expression-segmentation-on-j-hmdb | Gavrilyuk et al. (Optical flow) | 0.267 | 0.570 | 0.555 | 0.712 | 0.518 | 0.264 | 0.030 | 0.000 |
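
The metrics reported in the benchmarks above can be illustrated with a minimal sketch. It assumes the definitions commonly used for referring-expression segmentation: overall IoU is the total intersection divided by the total union over all samples, mean IoU is the average of per-sample IoUs, Precision@K is the fraction of samples whose IoU exceeds K, and AP is the mean of Precision@K over thresholds 0.50 to 0.95 in steps of 0.05. The function names, the strict inequality, and the handling of empty masks are assumptions made for illustration and are not taken from the authors' evaluation code.

```python
from typing import Dict, Sequence, Tuple

Mask = Sequence[Sequence[int]]  # binary mask given as rows of 0/1 values


def intersection_and_union(pred: Mask, target: Mask) -> Tuple[int, int]:
    """Count intersection and union pixels of two same-sized binary masks."""
    inter = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    return inter, union


def segmentation_metrics(
    preds: Sequence[Mask],
    targets: Sequence[Mask],
    thresholds: Sequence[float] = (0.5, 0.6, 0.7, 0.8, 0.9),
) -> Dict[str, float]:
    """Compute overall IoU, mean IoU, Precision@K and AP (hypothetical helper)."""
    if len(preds) == 0 or len(preds) != len(targets):
        raise ValueError("preds and targets must be non-empty and equally long")

    total_inter = 0
    total_union = 0
    ious = []
    for pred, target in zip(preds, targets):
        inter, union = intersection_and_union(pred, target)
        total_inter += inter
        total_union += union
        # Assumption: an empty prediction on an empty target is a perfect match.
        ious.append(inter / union if union > 0 else 1.0)

    n = len(ious)

    def precision_at(k: float) -> float:
        # Assumption: a sample counts as correct when its IoU is strictly above k.
        return sum(1 for iou in ious if iou > k) / n

    metrics = {
        "iou_overall": total_inter / total_union if total_union > 0 else 1.0,
        "iou_mean": sum(ious) / n,
    }
    for k in thresholds:
        metrics[f"precision@{k}"] = precision_at(k)

    # Assumption: AP averages Precision@K over 0.50, 0.55, ..., 0.95.
    ap_thresholds = [i / 100 for i in range(50, 100, 5)]
    metrics["ap"] = sum(precision_at(k) for k in ap_thresholds) / len(ap_thresholds)
    return metrics


if __name__ == "__main__":
    # Two toy 2x2 samples with per-sample IoUs of 0.5 and 0.75.
    preds = [[[1, 1], [0, 0]], [[1, 1], [1, 1]]]
    targets = [[[1, 0], [0, 0]], [[1, 1], [1, 0]]]
    for name, value in segmentation_metrics(preds, targets).items():
        print(f"{name}: {value:.3f}")
```

Overall IoU weights large objects more heavily because it pools pixels across samples, whereas mean IoU treats every sample equally. This is one reason the two columns can rank methods differently.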
