Null-text Inversion for Editing Real Images using Guided Diffusion Models
Ron Mokady, Amir Hertz, Kfir Aberman, Yael Pritch, Daniel Cohen-Or

Abstract
Recent text-guided diffusion models provide powerful image generation capabilities. Currently, massive efforts are devoted to enabling the modification of these images using only text, as a means of offering intuitive and versatile editing. To edit a real image using these state-of-the-art tools, one must first invert the image, together with a meaningful text prompt, into the pretrained model's domain. In this paper, we introduce an accurate inversion technique and thus facilitate intuitive text-based modification of the image. Our proposed inversion consists of two novel key components: (i) Pivotal inversion for diffusion models. While current methods aim at mapping random noise samples to a single input image, we use a single pivotal noise vector for each timestep and optimize around it. We demonstrate that a direct inversion is inadequate on its own, but it does provide a good anchor for our optimization. (ii) Null-text optimization, in which we modify only the unconditional textual embedding that is used for classifier-free guidance, rather than the input text embedding. This keeps both the model weights and the conditional embedding intact, and hence enables prompt-based editing while avoiding cumbersome tuning of the model's weights. Our null-text inversion, based on the publicly available Stable Diffusion model, is extensively evaluated on a variety of images and prompt edits, showing high-fidelity editing of real images.
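To make the two components concrete, below is a minimal, self-contained PyTorch sketch. It is not the authors' implementation: `eps_model`, `alphas`, `ddim_step`, the guidance scale, and all shapes are illustrative stand-ins for a real diffusion UNet and noise schedule, chosen only to show the structure of pivotal inversion followed by per-timestep null-embedding optimization.

```python
# Minimal sketch of the two components described in the abstract.
# Assumptions (not from the paper's code): eps_model is a stand-in
# epsilon-prediction network, alphas is a toy cumulative noise schedule,
# and all names/shapes are illustrative.
import torch

T = 50                                    # number of DDIM steps
alphas = torch.linspace(0.9999, 0.02, T)  # toy \bar{alpha}_t schedule

def eps_model(x, t, emb):
    # Placeholder network; a real setup would call the diffusion UNet.
    return torch.tanh(x + emb)

def ddim_step(x, t_from, t_to, eps):
    # Deterministic DDIM update between two timesteps (eta = 0).
    a_from, a_to = alphas[t_from], alphas[t_to]
    x0 = (x - (1 - a_from).sqrt() * eps) / a_from.sqrt()
    return a_to.sqrt() * x0 + (1 - a_to).sqrt() * eps

# (i) Pivotal inversion: run DDIM backwards (toward noise) to record
# one pivot latent per timestep, anchoring the later optimization.
def pivotal_inversion(x0, cond_emb):
    pivots = [x0]
    x = x0
    for t in range(T - 1):
        eps = eps_model(x, t, cond_emb)
        x = ddim_step(x, t, t + 1, eps)   # step toward higher noise
        pivots.append(x)
    return pivots

# (ii) Null-text optimization: at each timestep, tune only the
# unconditional ("null") embedding so that classifier-free-guided
# sampling lands on the recorded pivot; the model weights and the
# conditional embedding stay frozen.
def null_text_optimization(pivots, cond_emb, scale=7.5, iters=10, lr=1e-2):
    null_embs = []
    x = pivots[-1]                        # start from the noisiest pivot
    null_emb = torch.zeros_like(cond_emb)
    for t in range(T - 1, 0, -1):
        null_emb = null_emb.detach().clone().requires_grad_(True)
        opt = torch.optim.Adam([null_emb], lr=lr)
        for _ in range(iters):
            eps_u = eps_model(x, t, null_emb)
            eps_c = eps_model(x, t, cond_emb)
            eps = eps_u + scale * (eps_c - eps_u)  # classifier-free guidance
            x_prev = ddim_step(x, t, t - 1, eps)
            loss = (x_prev - pivots[t - 1]).pow(2).mean()
            opt.zero_grad()
            loss.backward()
            opt.step()
        null_embs.append(null_emb.detach())
        with torch.no_grad():             # advance along the trajectory
            eps_u = eps_model(x, t, null_emb)
            eps_c = eps_model(x, t, cond_emb)
            x = ddim_step(x, t, t - 1, eps_u + scale * (eps_c - eps_u))
    return null_embs

# Toy usage with stand-in embeddings and latents.
cond_emb = torch.randn(4)   # stand-in for the prompt's text embedding
x0 = torch.randn(4)         # stand-in for the encoded input image latent
pivots = pivotal_inversion(x0, cond_emb)
null_embs = null_text_optimization(pivots, cond_emb)
```

In a real setting, `eps_model` would be Stable Diffusion's UNet operating on VAE latents, and the optimized per-timestep null embeddings would be plugged into the guided sampler (e.g., together with Prompt-to-Prompt edits) to reconstruct or modify the image.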
Benchmarks
| Benchmark | Methodology | Background LPIPS | Background PSNR | CLIPSIM | Structure Distance |
|---|---|---|---|---|---|
| text-based-image-editing-on-pie-bench | Null-Text Inversion + Prompt-to-Prompt | 60.67 | 27.03 | 24.75 | 13.44 |