8 months ago

Computer Vision

Video Understanding

Semantic Segmentation

Viktor Varga, András Lőrincz

Abstract

Pixelwise annotation of image sequences can be very tedious for humans. Interactive video object segmentation aims to use automatic methods to speed up the process and reduce the workload of the annotators. Most contemporary approaches rely on deep convolutional networks to collect and process information from human annotations throughout the video. However, such networks contain millions of parameters and need huge amounts of labeled training data to avoid overfitting. Beyond that, label propagation is usually executed as a series of frame-by-frame inference steps, which is difficult to parallelize and is thus time-consuming. In this paper we present a graph neural network based approach for tackling the problem of interactive video object segmentation. Our network operates on superpixel graphs, which allow us to reduce the dimensionality of the problem by several orders of magnitude. We show that our network, possessing only a few thousand parameters, is able to achieve state-of-the-art performance, while inference remains fast and the network can be trained quickly with very little data.
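
To illustrate why operating on superpixel graphs shrinks the problem, the following is a minimal sketch, not the authors' pipeline. It assumes a superpixel label map is already available (for example from an oversegmentation method such as SLIC) and builds a graph with one node per superpixel and one edge per pair of touching superpixels. The function name `build_superpixel_graph`, the grayscale input, and the mean-intensity node feature are all hypothetical choices made for this example.

```python
from collections import defaultdict


def build_superpixel_graph(labels, image):
    """Return (mean_intensity, edges) for a superpixel label map.

    labels: 2D list of integer superpixel ids (assumed to be given).
    image:  2D list of grayscale intensities with the same shape.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    edges = set()
    height, width = len(labels), len(labels[0])
    for y in range(height):
        for x in range(width):
            a = labels[y][x]
            sums[a] += image[y][x]
            counts[a] += 1
            # Look at the right and bottom neighbours (4-connectivity);
            # a label change means the two superpixels share a border.
            for ny, nx in ((y, x + 1), (y + 1, x)):
                if ny < height and nx < width:
                    b = labels[ny][nx]
                    if a != b:
                        edges.add((min(a, b), max(a, b)))
    # One feature per node: the mean intensity of its pixels.
    mean_intensity = {sp: sums[sp] / counts[sp] for sp in sums}
    return mean_intensity, sorted(edges)


# Toy 3x4 frame split into three superpixels.
labels = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 2, 1],
]
image = [
    [0.0, 0.2, 0.8, 1.0],
    [0.1, 0.1, 0.9, 0.9],
    [0.5, 0.5, 0.5, 1.0],
]
features, edges = build_superpixel_graph(labels, image)
print(len(features), "nodes instead of", len(labels) * len(labels[0]), "pixels")
print(edges)
```

In this toy frame, 12 pixels collapse into 3 nodes. On real frames with hundreds of thousands of pixels and a few hundred superpixels, the reduction is far larger, and a graph neural network would then propagate labels over these nodes and edges rather than over individual pixels.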

