OpenDAS: Open-Vocabulary Domain Adaptation for 2D and 3D Segmentation

Yilmaz Gonca ; Peng Songyou ; Pollefeys Marc ; Engelmann Francis ; Blum Hermann

Abstract

Recently, Vision-Language Models (VLMs) have advanced segmentation techniques by shifting from the traditional segmentation of a closed-set of predefined object classes to open-vocabulary segmentation (OVS), allowing users to segment novel classes and concepts unseen during training of the segmentation model. However, this flexibility comes with a trade-off: fully-supervised closed-set methods still outperform OVS methods on base classes, that is, on classes on which they have been explicitly trained. This is due to the lack of pixel-aligned training masks for VLMs (which are trained on image-caption pairs), and the absence of domain-specific knowledge, such as autonomous driving. Therefore, we propose the task of open-vocabulary domain adaptation to infuse domain-specific knowledge into VLMs while preserving their open-vocabulary nature. By doing so, we achieve improved performance in base and novel classes. Existing VLM adaptation methods improve performance on base (training) queries, but fail to fully preserve the open-set capabilities of VLMs on novel queries. To address this shortcoming, we combine parameter-efficient prompt tuning with a triplet-loss-based training strategy that uses auxiliary negative queries. Notably, our approach is the only parameter-efficient method that consistently surpasses the original VLM on novel classes. Our adapted VLMs can seamlessly be integrated into existing OVS pipelines, e.g., improving OVSeg by +6.0% mIoU on ADE20K for open-vocabulary 2D segmentation, and OpenMask3D by +4.1% AP on ScanNet++ Offices for open-vocabulary 3D instance segmentation without other changes. The project page is available at https://open-das.github.io/.
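
The abstract mentions a triplet-loss-based training strategy with auxiliary negative queries but does not spell out the formulation. The following is a minimal sketch of a generic triplet loss over embeddings, not the authors' implementation. It assumes cosine distance, a hypothetical margin of 0.2, and that the hardest (closest) negative query is selected; the function names `cosine_distance` and `triplet_loss` are illustrative only.

```python
import numpy as np


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) along the last axis."""
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return 1.0 - np.sum(a * b, axis=-1)


def triplet_loss(anchor, positive, negatives, margin=0.2):
    """Generic triplet loss with hardest-negative selection.

    anchor:    (D,) visual embedding of a segment (assumed role)
    positive:  (D,) text embedding of the matching query (assumed role)
    negatives: (K, D) text embeddings of non-matching queries, which
               could include auxiliary negative queries
    margin:    hypothetical margin value, not taken from the paper
    """
    d_pos = cosine_distance(anchor, positive)
    d_neg = cosine_distance(anchor[None, :], negatives)  # shape (K,)
    hardest = d_neg.min()  # the negative closest to the anchor
    return float(max(0.0, margin + d_pos - hardest))
```

The loss is zero once the matching query is closer to the anchor than every negative by at least the margin, so adding more negative queries can only keep the loss the same or increase it. In a prompt-tuning setup, the gradient of such a loss would update only the learnable prompt parameters while the VLM weights stay frozen.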

Benchmarks

Benchmark: open-vocabulary-semantic-segmentation-on-2
Methodology: OVSeg + OpenDAS
Metrics: mIoU: 35.8
