Polar Relative Positional Encoding for Video-Language Segmentation

Qi Tian, Fei Wu, Lingxi Xie, Ke Ning

Abstract

In this paper, we tackle a challenging task named video-language segmentation. Given a video and a natural-language sentence, the goal is to segment the object or actor described by the sentence in the video frames. To denote a target object accurately, the sentence usually refers to multiple attributes, such as nearby objects and their spatial relations. We propose a novel Polar Relative Positional Encoding (PRPE) mechanism that represents spatial relations in a "linguistic" way, i.e., in terms of direction and range. Sentence features can then interact with the positional embeddings more directly to extract the implied relative positional relations. We also propose parameterized functions for these positional embeddings so that they adapt to real-valued directions and ranges. With PRPE, we design a Polar Attention Module (PAM) as the basic module for vision-language fusion. Our method outperforms the previous best method by a large margin of 11.4% absolute mAP on the challenging A2D Sentences dataset, and it also achieves competitive performance on the J-HMDB Sentences dataset.
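To make the idea of describing relative position by direction and range concrete, here is a minimal sketch in plain Python. It only illustrates the polar conversion of pixel offsets; the function names `polar_relative_positions` and `direction_features` are hypothetical, and the cosine/sine harmonics are merely one possible continuous function of a real-valued angle, assumed here for illustration and not the parameterized functions used in the paper.

```python
import math


def polar_relative_positions(height, width, query):
    """Return (directions, ranges) of every grid cell relative to `query`.

    `query` is a (row, col) position on a height x width feature grid.
    Direction is the atan2 angle in radians; with image coordinates the
    row index grows downward, so a positive angle points below the query.
    Range is the Euclidean distance in grid units.
    """
    qy, qx = query
    directions, ranges = [], []
    for y in range(height):
        dir_row, rng_row = [], []
        for x in range(width):
            dy, dx = y - qy, x - qx
            dir_row.append(math.atan2(dy, dx))
            rng_row.append(math.hypot(dx, dy))
        directions.append(dir_row)
        ranges.append(rng_row)
    return directions, ranges


def direction_features(theta, n_harmonics):
    """Hypothetical continuous features of a real-valued direction.

    Returns [cos(theta), sin(theta), cos(2*theta), sin(2*theta), ...],
    a stand-in for a learned function of the angle.
    """
    feats = []
    for k in range(1, n_harmonics + 1):
        feats.append(math.cos(k * theta))
        feats.append(math.sin(k * theta))
    return feats


directions, ranges = polar_relative_positions(3, 3, query=(1, 1))
```

In this sketch the cell immediately to the right of the query has direction 0 and range 1, which loosely matches how a phrase such as "to the right of" would be grounded as a direction plus a distance.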

Benchmarks

Benchmark: referring-expression-segmentation-on-a2d
Methodology: PRPE
Metrics:
AP: 0.388
IoU mean: 0.529
IoU overall: 0.661
Precision@0.5: 0.634
Precision@0.6: 0.579
Precision@0.7: 0.483
Precision@0.8: 0.322
Precision@0.9: 0.083

Benchmark: referring-expression-segmentation-on-j-hmdb
Methodology: PRPE
Metrics:
AP: 0.294
Precision@0.5: 0.690
Precision@0.6: 0.572
Precision@0.7: 0.319
Precision@0.8: 0.06
Precision@0.9: 0.001
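
For readers unfamiliar with the metric names, the sketch below shows how mean IoU, overall IoU, and Precision@K are commonly computed for referring segmentation. It assumes the usual conventions: overall IoU is total intersection over total union across all samples, mean IoU averages per-sample IoU, and Precision@K is the fraction of samples whose IoU exceeds K. The function names and the flat 0/1 mask representation are hypothetical choices for illustration, not taken from the paper's evaluation code.

```python
def intersection_union(pred, target):
    """Count intersection and union pixels of two flat binary masks."""
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    union = sum(1 for p, t in zip(pred, target) if p or t)
    return inter, union


def evaluate(preds, targets, thresholds=(0.5, 0.6, 0.7, 0.8, 0.9)):
    """Compute mean IoU, overall IoU, and Precision@K over a set of samples."""
    ious, total_inter, total_union = [], 0, 0
    for pred, target in zip(preds, targets):
        inter, union = intersection_union(pred, target)
        # A sample with an empty union is scored 0 here (an assumption).
        ious.append(inter / union if union else 0.0)
        total_inter += inter
        total_union += union
    n = len(ious)
    return {
        "iou_mean": sum(ious) / n,
        "iou_overall": total_inter / total_union if total_union else 0.0,
        "precision": {k: sum(iou > k for iou in ious) / n for k in thresholds},
    }


# Two toy samples: one perfect prediction, one covering a quarter of the target.
preds = [[1, 1, 0, 0], [1, 0, 0, 0]]
targets = [[1, 1, 0, 0], [1, 1, 1, 1]]
scores = evaluate(preds, targets)
```

Because a sample that clears a higher IoU threshold also clears every lower one, Precision@K cannot increase as K grows. Overall IoU weights large objects more heavily than mean IoU does, which is why the two numbers can differ noticeably.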
