MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing

Mingdeng Cao; Xintao Wang; Zhongang Qi; Ying Shan; Xiaohu Qie; Yinqiang Zheng

Abstract

Despite the success in large-scale text-to-image generation and text-conditioned image editing, existing methods still struggle to produce consistent generation and editing results. For example, generation approaches usually fail to synthesize multiple images of the same objects/characters but with different views or poses. Meanwhile, existing editing methods either fail to achieve effective complex non-rigid editing while maintaining the overall textures and identity, or require time-consuming fine-tuning to capture the image-specific appearance. In this paper, we develop MasaCtrl, a tuning-free method to achieve consistent image generation and complex non-rigid image editing simultaneously. Specifically, MasaCtrl converts existing self-attention in diffusion models into mutual self-attention, so that it can query correlated local contents and textures from source images for consistency. To further alleviate the query confusion between foreground and background, we propose a mask-guided mutual self-attention strategy, where the mask can be easily extracted from the cross-attention maps. Extensive experiments show that the proposed MasaCtrl can produce impressive results in both consistent image generation and complex non-rigid real image editing.
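The core idea in the abstract, querying source-image content from the target's self-attention layers, can be sketched in a few lines. The following is a minimal NumPy illustration and not the official implementation: the function names, the flattened `[tokens, channels]` layout, and the boolean masks are assumptions made for this example. In particular, the way the foreground and background results are blended is one plausible reading of "mask-guided mutual self-attention"; the abstract itself only states that the masks come from cross-attention maps.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v, key_mask=None):
    """Scaled dot-product attention for one head.

    q: [n_queries, d], k: [n_keys, d], v: [n_keys, d_v].
    key_mask: optional bool array [n_keys]; False keys are hidden.
    Assumes at least one key stays visible (otherwise rows become NaN).
    """
    scores = q @ k.T / np.sqrt(q.shape[-1])
    if key_mask is not None:
        scores = np.where(key_mask[None, :], scores, -np.inf)
    return softmax(scores) @ v

def mutual_self_attention(q_target, k_source, v_source):
    # Ordinary self-attention would use the target's own keys and values.
    # Here the target's queries read keys/values taken from the source image.
    return attention(q_target, k_source, v_source)

def masked_mutual_self_attention(q_target, k_source, v_source,
                                 mask_source, mask_target):
    # Hypothetical blending rule (an assumption of this sketch):
    # foreground target tokens attend only to source foreground,
    # background target tokens attend only to source background.
    fg = attention(q_target, k_source, v_source, key_mask=mask_source)
    bg = attention(q_target, k_source, v_source, key_mask=~mask_source)
    return np.where(mask_target[:, None], fg, bg)
```

Because each output row is a softmax-weighted average of source value rows, the target can only reuse content that is present in the source features, which is the intuition behind the consistency claim. Restricting the visible keys by mask keeps foreground queries from picking up background textures and vice versa.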

Code Repositories

tencentarc/masactrl: Official, PyTorch, mentioned in GitHub
hansam95/nmg: PyTorch, mentioned in GitHub
thu-cvml/texturediffusion: PyTorch, mentioned in GitHub
phymhan/prompt-to-prompt: PyTorch, mentioned in GitHub

Benchmarks

Benchmark: text-based-image-editing-on-pie-bench
Methodology: DDIM Inversion+MasaCtrl
Metrics:
Background LPIPS: 106.62
Background PSNR: 22.17
CLIPSIM: 23.96
Structure Distance: 28.38
