CoSMo: Content-Style Modulation for Image Retrieval With Text Feedback

Bohyung Han, Dongwan Kim, Seungmin Lee

Abstract

We tackle the task of image retrieval with text feedback, where a reference image and modifier text are combined to identify the desired target image. We focus on designing an image-text compositor, i.e., integrating multi-modal inputs to produce a representation similar to that of the target image. In our algorithm, Content-Style Modulation (CoSMo), we approach this challenge by introducing two modules based on deep neural networks: the content and style modulators. The content modulator performs local updates to the reference image feature after normalizing the style of the image, where a disentangled multi-modal non-local block is employed to achieve the desired content modifications. Then, the style modulator reintroduces global style information to the updated feature. We provide an in-depth view of our algorithm and its design choices, and show that it accomplishes outstanding performance on multiple image-text retrieval benchmarks. Our code can be found at: https://github.com/postBG/CosMo.pytorch
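The two-stage flow described in the abstract (normalize style, update content, reintroduce style) can be illustrated with a toy sketch. This is not the authors' implementation. It assumes that "style" means per-channel mean and standard deviation, a common convention that the abstract does not confirm. It replaces the disentangled multi-modal non-local block with a caller-supplied placeholder function, and it takes the scale and shift values as plain arguments because the abstract does not say how CoSMo derives them. All function names are hypothetical.

```python
import math


def channel_stats(channel, eps=1e-5):
    """Return the mean and standard deviation of one channel's spatial values."""
    mean = sum(channel) / len(channel)
    var = sum((v - mean) ** 2 for v in channel) / len(channel)
    return mean, math.sqrt(var + eps)


def normalize_style(feature):
    """Strip per-channel statistics (the assumed 'style') from a feature map.

    `feature` is a list of channels, each a flat list of spatial values.
    """
    normalized = []
    for channel in feature:
        mean, std = channel_stats(channel)
        normalized.append([(v - mean) / std for v in channel])
    return normalized


def apply_style(feature, scales, shifts):
    """Reintroduce global style as a channel-wise scale and shift."""
    return [
        [scale * v + shift for v in channel]
        for channel, scale, shift in zip(feature, scales, shifts)
    ]


def compose(feature, content_update, scales, shifts):
    """Normalize style, apply a content update, then reapply style."""
    normalized = normalize_style(feature)
    # Placeholder for the content modulator; the real block is not modeled here.
    updated = content_update(normalized)
    return apply_style(updated, scales, shifts)


# Toy reference feature: 2 channels with 4 spatial positions each.
reference = [[1.0, 2.0, 3.0, 4.0], [10.0, 10.0, 20.0, 20.0]]
stats = [channel_stats(channel) for channel in reference]

# With an identity content update and the original statistics as the style,
# the pipeline reconstructs the input feature.
restored = compose(
    reference,
    content_update=lambda f: f,
    scales=[std for _, std in stats],
    shifts=[mean for mean, _ in stats],
)
```

The identity case is only a sanity check: it shows that normalization and style reintroduction are inverse operations when nothing else changes, so any difference in the output comes from the content update or from new style parameters.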

Benchmarks

| Benchmark | Methodology | Metrics |
| --- | --- | --- |
| image-retrieval-on-fashion-iq | CoSMo | (Recall@10+Recall@50)/2: 39.45 |
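For readers unfamiliar with the reported metric, the following minimal sketch shows one way to compute (Recall@10+Recall@50)/2 from ranked retrieval results. It assumes a single target image per query and reports recall as a percentage. Any per-category averaging used by the benchmark is not modeled, and the function names and toy data are hypothetical.

```python
def recall_at_k(ranked_ids, target_id, k):
    """Return 1.0 if the target appears in the top-k results, else 0.0."""
    return 1.0 if target_id in ranked_ids[:k] else 0.0


def mean_recall_at_k(rankings, targets, k):
    """Average Recall@k over all queries, as a percentage."""
    hits = [recall_at_k(ranked, target, k) for ranked, target in zip(rankings, targets)]
    return 100.0 * sum(hits) / len(hits)


def averaged_recall_score(rankings, targets):
    """Compute (Recall@10 + Recall@50) / 2."""
    r10 = mean_recall_at_k(rankings, targets, 10)
    r50 = mean_recall_at_k(rankings, targets, 50)
    return (r10 + r50) / 2


# Toy example: 4 queries, each ranking the same 60 candidate ids (0..59).
candidates = list(range(60))
rankings = [candidates] * 4
# Targets sit at ranks 3, 20, 55 and 1 (1-indexed) respectively.
targets = [2, 19, 54, 0]

score = averaged_recall_score(rankings, targets)
```

In this toy data two of four targets fall in the top 10 and three of four in the top 50, so Recall@10 is 50.0, Recall@50 is 75.0, and the averaged score is 62.5.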
