3 months ago

Retrieval-Augmented Open-Vocabulary Object Detection

Jooyeon Kim Eulrang Cho Sehyung Kim Hyunwoo J. Kim

Abstract

Open-vocabulary object detection (OVD) has been studied with Vision-Language Models (VLMs) to detect novel objects beyond the pre-trained categories. Previous approaches improve generalization by expanding the knowledge of the detector, using 'positive' pseudo-labels with additional 'class' names, e.g., sock, iPod, and alligator. To extend the previous methods in two aspects, we propose Retrieval-Augmented Losses and visual Features (RALF). Our method retrieves related 'negative' classes and augments loss functions. Also, visual features are augmented with 'verbalized concepts' of classes, e.g., worn on the feet, handheld music player, and sharp teeth. Specifically, RALF consists of two modules: Retrieval-Augmented Losses (RAL) and Retrieval-Augmented visual Features (RAF). RAL comprises two losses that reflect the semantic similarity with negative vocabularies. In addition, RAF augments visual features with the verbalized concepts from a large language model (LLM). Our experiments demonstrate the effectiveness of RALF on the COCO and LVIS benchmark datasets. We achieve improvements of up to 3.4 box AP$_{50}^{\text{N}}$ on novel categories of the COCO dataset and gains of 3.6 mask AP$_{\text{r}}$ on the LVIS dataset. Code is available at https://github.com/mlvlab/RALF .
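
The abstract does not spell out how the 'negative' classes are retrieved, so the following is only a minimal sketch of one plausible reading: rank a vocabulary by cosine similarity to a class embedding and keep the closest entries as negatives. The function names (`cosine`, `retrieve_negatives`), the toy vocabulary, and the hand-written 3-dimensional vectors are all assumptions made for illustration; in practice the embeddings would presumably come from a VLM text encoder, and the official repository may use a different selection rule.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieve_negatives(query, vocab, k=2):
    """Return the k vocabulary entries most similar to `query`, excluding it.

    Hypothetical helper: treats the nearest neighbours in embedding space
    as related 'negative' classes.
    """
    q = vocab[query]
    scored = [
        (cosine(q, emb), name) for name, emb in vocab.items() if name != query
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# Toy embeddings, invented for this example (not real VLM outputs).
TOY_VOCAB = {
    "sock": [0.9, 0.1, 0.0],
    "stocking": [0.8, 0.2, 0.1],
    "shoe": [0.7, 0.3, 0.2],
    "alligator": [0.0, 0.1, 0.9],
    "iPod": [0.1, 0.9, 0.1],
}

negatives = retrieve_negatives("sock", TOY_VOCAB, k=2)
print(negatives)
```

With these toy vectors, the two entries retrieved for "sock" are "stocking" and "shoe", while the unrelated "alligator" and "iPod" rank last. A loss that reflects semantic similarity could then weight or contrast against such retrieved entries, but the exact form of the two RAL losses is not given in the abstract.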

Code Repositories

mlvlab/RALF
Official
Mentioned in GitHub

Benchmarks

| Benchmark | Methodology | Metrics |
| --- | --- | --- |
| open-vocabulary-object-detection-on-lvis-v1-0 | RALF | AP novel-LVIS base training: 21.9 |
| open-vocabulary-object-detection-on-mscoco | RALF | AP 0.5: 41.3 |
