4 months ago

Qingyan Bai, Qiuyu Wang, Hao Ouyang, Yue Yu, Hanlin Wang, Wen Wang, Ka Leong Cheng, Shuailei Ma, Yanhong Zeng, Zichen Liu, and 3 more

Abstract

Instruction-based video editing promises to democratize content creation, yet its progress is severely hampered by the scarcity of large-scale, high-quality training data. We introduce Ditto, a holistic framework designed to tackle this fundamental challenge. At its heart, Ditto features a novel data generation pipeline that fuses the creative diversity of a leading image editor with an in-context video generator, overcoming the limited scope of existing models. To make this process viable, our framework resolves the prohibitive cost-quality trade-off by employing an efficient, distilled model architecture augmented by a temporal enhancer, which simultaneously reduces computational overhead and improves temporal coherence. Finally, to achieve full scalability, this entire pipeline is driven by an intelligent agent that crafts diverse instructions and rigorously filters the output, ensuring quality control at scale. Using this framework, we invested over 12,000 GPU-days to build Ditto-1M, a new dataset of one million high-fidelity video editing examples. We trained our model, Editto, on Ditto-1M with a curriculum learning strategy. The results demonstrate superior instruction-following ability and establish a new state-of-the-art in instruction-based video editing.
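To put the stated compute budget in perspective, the two figures in the abstract can be combined into a rough per-example cost. This is only a back-of-the-envelope sketch: it treats "over 12,000 GPU-days" as a lower bound, and it assumes the budget is amortized evenly over the one million retained examples. The abstract does not say how much of the compute went to samples that were later filtered out, so the function name and the even-split assumption are illustrative, not taken from the paper.

```python
# Figures quoted in the abstract.
GPU_DAYS = 12_000          # lower bound ("over 12,000 GPU-days")
NUM_EXAMPLES = 1_000_000   # size of Ditto-1M

def gpu_minutes_per_example(gpu_days, num_examples):
    """Average GPU-minutes per retained example (hypothetical helper).

    Assumes the whole budget is spread evenly over the retained examples.
    """
    minutes_per_day = 24 * 60
    return gpu_days * minutes_per_day / num_examples

budget = gpu_minutes_per_example(GPU_DAYS, NUM_EXAMPLES)
print(f"at least {budget:.2f} GPU-minutes per example")
```

Under these assumptions the average comes to a little over 17 GPU-minutes per retained example, which gives a sense of why the abstract emphasizes a distilled generator to keep the cost-quality trade-off manageable.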


Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset
