Amir Bar; Yossi Gandelsman; Trevor Darrell; Amir Globerson; Alexei A. Efros

Abstract
How does one adapt a pre-trained visual model to novel downstream tasks without task-specific finetuning or any model modification? Inspired by prompting in NLP, this paper investigates visual prompting: given input-output image example(s) of a new task at test time and a new input image, the goal is to automatically produce the output image, consistent with the given examples. We show that posing this problem as simple image inpainting - literally just filling in a hole in a concatenated visual prompt image - turns out to be surprisingly effective, provided that the inpainting algorithm has been trained on the right data. We train masked auto-encoders on a new dataset that we curated - 88k unlabeled figures from academic papers sourced from Arxiv. We apply visual prompting to these pretrained models and demonstrate results on various downstream image-to-image tasks, including foreground segmentation, single object detection, colorization, edge detection, etc.
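The abstract describes a concatenated visual prompt: the example input-output pair and the query image are tiled into a single grid image, and the inpainting model fills the remaining hole with the predicted output. A minimal sketch of that prompt construction (the function name and 2x2 layout here are illustrative assumptions, not the paper's code):

```python
import numpy as np

def build_visual_prompt(example_in, example_out, query, fill=0):
    """Assemble a 2x2 visual prompt grid (hypothetical helper):
       [example input | example output]
       [query input   | hole to be inpainted]
    All inputs are HxWxC arrays of the same shape and dtype."""
    h, w, c = example_in.shape
    canvas = np.full((2 * h, 2 * w, c), fill, dtype=example_in.dtype)
    canvas[:h, :w] = example_in
    canvas[:h, w:] = example_out
    canvas[h:, :w] = query
    # Boolean mask marking the quadrant the inpainting model must fill.
    mask = np.zeros((2 * h, 2 * w), dtype=bool)
    mask[h:, w:] = True
    return canvas, mask

# Toy usage with random 16x16 RGB images.
rng = np.random.default_rng(0)
ex_in, ex_out, q = (rng.integers(0, 255, (16, 16, 3), dtype=np.uint8)
                    for _ in range(3))
canvas, mask = build_visual_prompt(ex_in, ex_out, q)
```

The masked region would then be passed, together with the visible quadrants, to the pretrained masked auto-encoder, whose reconstruction of the hole serves as the task output.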
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| personalized-segmentation-on-perseg | Visual Prompting | mIoU: 65.88 |