Guo Zixian, Dong Bowen, Ji Zhilong, Bai Jinfeng, Guo Yiwen, Zuo Wangmeng

Abstract
Prompt tuning has been employed as an efficient way to adapt large vision-language pre-trained models (e.g., CLIP) to various downstream tasks in data-limited or label-limited settings. Nonetheless, visual data (e.g., images) is by default a prerequisite for learning prompts in existing methods. In this work, we advocate that the effectiveness of image-text contrastive learning in aligning the two modalities (for training CLIP) further makes it feasible to treat texts as images for prompt tuning, and we introduce TaI prompting. In contrast to visual data, text descriptions are easy to collect, and their class labels can be directly derived. In particular, we apply TaI prompting to multi-label image recognition, where sentences in the wild serve as alternatives to images for prompt tuning. Moreover, with TaI, double-grained prompt tuning (TaI-DPT) is further presented to extract both coarse-grained and fine-grained embeddings for enhancing multi-label recognition performance. Experimental results show that our proposed TaI-DPT outperforms zero-shot CLIP by a large margin on multiple benchmarks, e.g., MS-COCO, VOC2007, and NUS-WIDE, while it can be combined with existing methods of prompting from images to further improve recognition performance. Code is released at https://github.com/guozix/TaI-DPT.
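To make the core idea concrete, below is a minimal PyTorch sketch (not the authors' released implementation) of CoOp-style learnable prompts trained with text-as-image supervision: wild captions stand in for images, and multi-label targets are derived directly from class-name occurrence in the captions. The `classnames`, `captions`, and the binary cross-entropy loss are illustrative assumptions; the paper derives labels more carefully and uses a ranking-style loss, and the fine-grained (double-grained) branch is omitted here.

```python
import torch
import torch.nn as nn
import clip  # OpenAI CLIP: pip install git+https://github.com/openai/CLIP.git

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("RN50", device=device)
model = model.float()
for p in model.parameters():          # CLIP stays frozen; only the context vectors are learned
    p.requires_grad_(False)

classnames = ["dog", "person", "bicycle"]   # hypothetical label set
n_ctx = 16                                  # number of learnable context tokens

# Tokenize "X X ... X classname." templates so we know where the class tokens sit.
prompt_prefix = " ".join(["X"] * n_ctx)
prompts = [prompt_prefix + " " + name + "." for name in classnames]
tokenized = torch.cat([clip.tokenize(p) for p in prompts]).to(device)   # (C, 77)

with torch.no_grad():
    embedding = model.token_embedding(tokenized)                        # (C, 77, dim)

# Learnable context vectors shared across classes (CoOp-style).
ctx_dim = model.ln_final.weight.shape[0]
ctx = nn.Parameter(torch.randn(n_ctx, ctx_dim, device=device) * 0.02)

prefix = embedding[:, :1, :]           # SOS token embedding
suffix = embedding[:, 1 + n_ctx:, :]   # class tokens, ".", EOS, padding

def encode_prompts():
    """Run the learnable prompts through CLIP's frozen text transformer."""
    x = torch.cat([prefix, ctx.unsqueeze(0).expand(len(classnames), -1, -1), suffix], dim=1)
    x = x + model.positional_embedding
    x = x.permute(1, 0, 2)             # NLD -> LND
    x = model.transformer(x)
    x = x.permute(1, 0, 2)
    x = model.ln_final(x)
    eos = tokenized.argmax(dim=-1)     # EOS has the highest token id
    x = x[torch.arange(x.shape[0]), eos] @ model.text_projection
    return x / x.norm(dim=-1, keepdim=True)

# Text-as-Image step: captions replace images; labels come from class-name matching.
captions = ["a dog chasing a person riding a bicycle", "a person walking a dog"]
targets = torch.tensor(
    [[float(name in c) for name in classnames] for c in captions], device=device)

with torch.no_grad():
    cap_feat = model.encode_text(clip.tokenize(captions).to(device))
    cap_feat = cap_feat / cap_feat.norm(dim=-1, keepdim=True)

optimizer = torch.optim.SGD([ctx], lr=2e-3)
logits = model.logit_scale.exp() * cap_feat @ encode_prompts().t()      # (B, C)
loss = nn.functional.binary_cross_entropy_with_logits(logits, targets)  # simplified surrogate loss
loss.backward()
optimizer.step()
```

At test time the same learned prompts score image embeddings from CLIP's visual encoder instead of caption embeddings, which is what makes treating texts as images feasible: the contrastively trained encoders place the two modalities in a shared embedding space.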
Code Repositories
https://github.com/guozix/TaI-DPT
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| multi-label-image-recognition-with-partial | DualCoOp+TaI-DPT | Average mAP: 83.6 |
| multi-label-image-recognition-with-partial-1 | DualCoOp+TaI-DPT | Average mAP: 94.8 |