Learning What to Learn for Video Object Segmentation

Goutam Bhat; Felix Järemo Lawin; Martin Danelljan; Andreas Robinson; Michael Felsberg; Luc Van Gool; Radu Timofte

Abstract

Video object segmentation (VOS) is a highly challenging problem, since the target object is only defined during inference with a given first-frame reference mask. The problem of how to capture and utilize this limited target information remains a fundamental research question. We address this by introducing an end-to-end trainable VOS architecture that integrates a differentiable few-shot learning module. This internal learner is designed to predict a powerful parametric model of the target by minimizing a segmentation error in the first frame. We further go beyond standard few-shot learning techniques by learning what the few-shot learner should learn. This allows us to achieve a rich internal representation of the target in the current frame, significantly increasing the segmentation accuracy of our approach. We perform extensive experiments on multiple benchmarks. Our approach sets a new state-of-the-art on the large-scale YouTube-VOS 2018 dataset by achieving an overall score of 81.5, corresponding to a 2.6% relative improvement over the previous best result.
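To make the idea of an internal learner that fits a target model by minimizing a first-frame segmentation error more concrete, here is a minimal sketch. It is not the authors' implementation: it assumes a plain linear per-pixel model, a regularized squared error against the binary first-frame mask, and a few steepest-descent steps with exact step length. The names `fit_target_model` and `loss`, the feature dimensions, and the synthetic data are all hypothetical.

```python
import numpy as np


def loss(features, mask, w, reg=1e-2):
    """Regularized squared error of a linear per-pixel model (assumed objective)."""
    residual = features @ w - mask
    return 0.5 * (residual @ residual) + 0.5 * reg * (w @ w)


def fit_target_model(features, mask, num_steps=5, reg=1e-2):
    """Fit weights w by a few steepest-descent steps on the first-frame error.

    features: (N, C) array of per-pixel features from the first frame.
    mask:     (N,) array with the reference mask values in {0, 1}.
    """
    w = np.zeros(features.shape[1])
    for _ in range(num_steps):
        residual = features @ w - mask
        grad = features.T @ residual + reg * w
        # Exact step length for a quadratic loss: alpha = g.g / (g.H.g),
        # with H = X^T X + reg * I applied to g without forming H.
        hess_grad = features.T @ (features @ grad) + reg * grad
        denom = grad @ hess_grad
        if denom <= 1e-12:
            break  # gradient has vanished, nothing left to optimize
        w = w - (grad @ grad) / denom * grad
    return w


# Synthetic stand-in for first-frame features and mask (illustration only).
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 8))
mask = (features @ rng.normal(size=8) > 0).astype(float)

w = fit_target_model(features, mask)
initial_loss = loss(features, mask, np.zeros(8))
final_loss = loss(features, mask, w)
print(initial_loss, final_loss)
```

Every operation in the loop is differentiable, so a learner of this kind can in principle sit inside a network that is trained end to end, which is the property the abstract emphasizes. The abstract's further step, learning what the few-shot learner should learn, would replace the fixed binary mask target used here with a learned one; that part is not sketched.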

Code Repositories

visionml/pytracking (official, PyTorch)
maoyunyao/joint (PyTorch)

Benchmarks

Benchmark: semi-supervised-video-object-segmentation-on-20
Methodology: LWL
Metrics:
D17 val (F): 76.3
D17 val (G): 74.3
D17 val (J): 72.2
FPS: 14.0
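The reported numbers can be sanity-checked with a little arithmetic. The sketch below assumes the usual DAVIS 2017 convention that the global score G is the mean of region similarity J and contour accuracy F, and it backs out the previous best YouTube-VOS 2018 score implied by the abstract's 2.6% relative improvement. The function names are hypothetical, and the implied baseline is a derived estimate, not a figure stated on this page.

```python
def davis_global_score(j_mean, f_mean):
    """Mean of region similarity (J) and contour accuracy (F)."""
    return (j_mean + f_mean) / 2.0


def baseline_from_relative_gain(new_score, relative_gain_percent):
    """Previous score implied by a new score and its relative improvement."""
    return new_score / (1.0 + relative_gain_percent / 100.0)


g = davis_global_score(72.2, 76.3)
prev = baseline_from_relative_gain(81.5, 2.6)
print(g, prev)
```

From the rounded J and F values, G comes out at about 74.25, consistent with the listed 74.3 if the published score was computed from unrounded inputs. The implied previous best on YouTube-VOS 2018 is roughly 79.4.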
