Alleviating Distortion in Image Generation via Multi-Resolution Diffusion Models
Qihao Liu Zhanpeng Zeng Ju He Qihang Yu Xiaohui Shen Liang-Chieh Chen

Abstract
This paper presents innovative enhancements to diffusion models by integrating a novel multi-resolution network and time-dependent layer normalization. Diffusion models have gained prominence for their effectiveness in high-fidelity image generation. While conventional approaches rely on convolutional U-Net architectures, recent Transformer-based designs have demonstrated superior performance and scalability. However, Transformer architectures, which tokenize input data (via "patchification"), face a trade-off between visual fidelity and computational complexity due to the quadratic nature of self-attention operations concerning token length. While larger patch sizes enable attention computation efficiency, they struggle to capture fine-grained visual details, leading to image distortions. To address this challenge, we propose augmenting the Diffusion model with the Multi-Resolution network (DiMR), a framework that refines features across multiple resolutions, progressively enhancing detail from low to high resolution. Additionally, we introduce Time-Dependent Layer Normalization (TD-LN), a parameter-efficient approach that incorporates time-dependent parameters into layer normalization to inject time information and achieve superior performance. Our method's efficacy is demonstrated on the class-conditional ImageNet generation benchmark, where DiMR-XL variants outperform prior diffusion models, setting new state-of-the-art FID scores of 1.70 on ImageNet 256 x 256 and 2.89 on ImageNet 512 x 512. Project page: https://qihao067.github.io/projects/DiMR
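The patch-size trade-off described in the abstract can be illustrated with simple arithmetic. The sketch below is not taken from the paper: the function names and the 32 x 32 input grid are assumptions chosen only for illustration. It counts the tokens produced by non-overlapping square patches and the number of pairwise interactions that full self-attention computes over them.

```python
def num_tokens(side: int, patch: int) -> int:
    """Tokens from splitting a side x side grid into patch x patch tiles."""
    if side % patch != 0:
        raise ValueError("side must be divisible by patch")
    return (side // patch) ** 2


def attention_pairs(tokens: int) -> int:
    """Query-key pairs in full self-attention: quadratic in token count."""
    return tokens * tokens


if __name__ == "__main__":
    side = 32  # hypothetical input grid size, not a value from the paper
    for patch in (2, 4, 8):
        n = num_tokens(side, patch)
        print(f"patch={patch}: tokens={n}, attention pairs={attention_pairs(n)}")
```

Under these assumptions, doubling the patch size cuts the token count by 4x and the attention pairs by 16x, but each token then has to summarize four times as many input positions, which is the loss of fine detail that the abstract refers to.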
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| image-generation-on-imagenet-256x256 | DiMR-G/2R | FID: 1.63 |
| image-generation-on-imagenet-256x256 | DiMR-XL/2R | FID: 1.70 |
| image-generation-on-imagenet-512x512 | DiMR-XL/3R | FID: 2.89 |