HiDream-E1.1: Command-based Image Editor

Date: 5 months ago

Size: 391.24 MB

1. Tutorial Introduction

HiDream-E1.1 is an open-source image editing model released by HiDream.ai in July 2025. Built on HiDream.ai's own Sparse Diffusion Transformer architecture, it supports megapixel resolution and is released under the MIT license. The model edits images from natural-language instructions: users can carry out complex tasks such as color adjustment, style transfer, and adding or removing elements with simple text commands, without specialized software skills.

This tutorial runs on two A6000 GPUs and supports Chinese, English, French, and other languages.

2. Project Examples

3. Operation Steps

1. Start the container

2. After entering the webpage, you can use the model

If the page shows "Bad Gateway", the model is still initializing. Because the model is large, wait about 5-6 minutes and then refresh the page. Each image edit is also slow, taking roughly 5-6 minutes, so please be patient.
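Instead of refreshing by hand, the wait can be scripted. The following is a minimal sketch, not part of the original tutorial: it polls the page until it stops answering with an HTTP error ("Bad Gateway" corresponds to status 502). The URL in the usage comment is a placeholder for whatever address your container exposes, and the 30-second interval and 10-minute limit are assumed values chosen to cover the 5-6 minute startup.

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout=10.0):
    """Return the HTTP status code for url, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 502 while the model is still loading
    except (urllib.error.URLError, OSError):
        return None  # connection refused, DNS failure, timeout, ...


def wait_until_ready(url, fetch=http_status, interval=30.0, max_wait=600.0,
                     sleep=time.sleep):
    """Poll url until it returns a non-error status or max_wait seconds pass.

    Returns True once a status below 400 is seen, False on timeout.
    interval and max_wait are assumed defaults, not values from the tutorial.
    """
    waited = 0.0
    while True:
        status = fetch(url)
        if status is not None and status < 400:
            return True
        if waited >= max_wait:
            return False
        sleep(interval)
        waited += interval


# Usage (placeholder URL; replace with the address of your own container):
# if wait_until_ready("http://localhost:8080/"):
#     print("Model page is up; open it in the browser.")
```

The `fetch` and `sleep` parameters exist only so the loop can be exercised without a network; in normal use the defaults are enough.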

4. Discussion

If you come across a high-quality project, please leave us a message to recommend it. We have also set up a tutorial discussion group: scan the QR code and add the note [SD Tutorial] to join, discuss technical issues, and share your results.

Citation Information

The citation information for this project is as follows:

@InProceedings{fastvlm2025,
  author = {Pavan Kumar Anasosalu Vasu and Fartash Faghri and Chun-Liang Li and Cem Koc and Nate True and Albert Antony and Gokul Santhanam and James Gabriel and Peter Grasch and Oncel Tuzel and Hadi Pouransari},
  title = {FastVLM: Efficient Vision Encoding for Vision Language Models},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month = {June},
  year = {2025},
}

This notebook is contributed by community users and is intended for educational and informational purposes only. If any content involves copyright infringement, please contact us at support@hyper.ai for prompt review and removal.
