
HiDream-E1.1: Command-based Image Editor

1. Tutorial Introduction


The HiDream-E1.1 model is an open-source image editing model released by HiDream.ai in July 2025. Built on the company's in-house Sparse Diffusion Transformer architecture, it supports megapixel-resolution images and is released under the MIT license. The model provides instruction-driven image editing in natural language: users can carry out complex tasks such as color adjustment, style transfer, and adding or removing elements with simple text instructions, without needing specialized software skills.

This tutorial runs on two A6000 GPUs and supports Chinese, English, French, and other languages.

2. Project Examples

3. Operation Steps

1. Start the container

2. Once the web page loads, you can start using the model

If the page shows "Bad Gateway", the model is still initializing. Because the model is large, wait about 5-6 minutes and then refresh the page. Image processing is also slow, taking approximately 5-6 minutes, so please be patient.
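
Instead of refreshing by hand, the wait can be scripted. The following is a minimal sketch, not part of the tutorial itself: it assumes "Bad Gateway" corresponds to an HTTP 502 response and that the page returns status 200 once the model is ready. The function names, the 10-minute limit, and the 30-second polling interval are illustrative choices, and the URL you pass in is whatever address your container exposes.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for a GET request to `url`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is still available.
        return err.code


def wait_until_ready(
    url: str,
    max_wait_s: float = 600.0,   # assumption: give up after 10 minutes
    interval_s: float = 30.0,    # assumption: poll every 30 seconds
    get_status: Callable[[str], int] = fetch_status,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `url` until it answers with 200, or until `max_wait_s` has elapsed."""
    waited = 0.0
    while True:
        try:
            status = get_status(url)
        except OSError:
            # Connection refused or timed out: treat as "not ready yet".
            status = None
        if status == 200:
            return True
        if waited >= max_wait_s:
            return False
        sleep(interval_s)
        waited += interval_s
```

Calling `wait_until_ready(url)` with the address of your running container returns `True` as soon as the page responds normally and `False` if it is still unavailable after the time limit. The `get_status` and `sleep` parameters exist only so the polling logic can be exercised without a network connection.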

4. Discussion

🖌️ If you come across a high-quality project, please send us a message to recommend it! We have also set up a tutorial discussion group. Scan the QR code and add the note [SD Tutorial] to join the group, discuss technical issues, and share your results.

Citation Information

The citation information for this project is as follows:

@InProceedings{fastvlm2025,
  author = {Pavan Kumar Anasosalu Vasu and Fartash Faghri and Chun-Liang Li and Cem Koc and Nate True and Albert Antony and Gokul Santhanam and James Gabriel and Peter Grasch and Oncel Tuzel and Hadi Pouransari},
  title = {FastVLM: Efficient Vision Encoding for Vision Language Models},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month = {June},
  year = {2025},
}