Omni-Effects: Unified and Spatially-Controllable Visual Effects Generation

Visual effects (VFX) are visual enhancements fundamental to modern cinematic production. Although video generation models offer cost-efficient solutions for VFX production, current methods are constrained by per-effect LoRA training, which limits generation to single effects. This fundamental limitation impedes applications that require spatially controllable composite effects, i.e., the concurrent generation of multiple effects at designated locations. However, integrating diverse effects into a unified framework faces two major challenges: interference from effect variations and spatial uncontrollability during multi-VFX joint training. To tackle these challenges, we propose Omni-Effects, the first unified framework capable of generating prompt-guided effects and spatially controllable composite effects. The core of our framework comprises two key innovations: (1) a LoRA-based Mixture of Experts (LoRA-MoE), which employs a group of expert LoRAs to integrate diverse effects within a unified model while effectively mitigating cross-task interference; and (2) a Spatial-Aware Prompt (SAP), which incorporates spatial mask information into the text tokens to enable precise spatial control. Furthermore, we introduce an Independent-Information Flow (IIF) module integrated within the SAP that isolates the control signals of individual effects to prevent unwanted blending. To facilitate this research, we construct Omni-VFX, a comprehensive VFX dataset, via a novel data collection pipeline combining image editing and First-Last Frame-to-Video (FLF2V) synthesis, and we introduce a dedicated VFX evaluation framework for validating model performance. Extensive experiments demonstrate that Omni-Effects achieves precise spatial control and diverse effect generation, enabling users to specify both the category and location of desired effects.
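To make the LoRA-MoE idea concrete: the abstract states only that a group of expert LoRAs is attached to a unified model to absorb different effects while limiting cross-task interference, without detailing the routing rule. The PyTorch sketch below is one plausible reading, not the paper's implementation; the soft token-level gate (`LoRAMoE.gate`), the expert count, and the rank are illustrative assumptions.

```python
import torch
import torch.nn as nn

class LoRAExpert(nn.Module):
    """One low-rank adapter: x -> (alpha/r) * B(A(x))."""
    def __init__(self, dim_in: int, dim_out: int, rank: int = 16, alpha: float = 16.0):
        super().__init__()
        self.down = nn.Linear(dim_in, rank, bias=False)   # A
        self.up = nn.Linear(rank, dim_out, bias=False)    # B
        nn.init.zeros_(self.up.weight)                    # adapter starts as a no-op
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.up(self.down(x)) * self.scale

class LoRAMoE(nn.Module):
    """Frozen base linear layer plus a routed group of expert LoRAs.

    A gating network mixes expert outputs per token, so different effects
    are served by (mostly) different experts, which is one way to limit
    interference during multi-VFX joint training.
    """
    def __init__(self, base: nn.Linear, num_experts: int = 4, rank: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)                        # backbone stays frozen
        self.experts = nn.ModuleList(
            [LoRAExpert(base.in_features, base.out_features, rank)
             for _ in range(num_experts)]
        )
        self.gate = nn.Linear(base.in_features, num_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        weights = torch.softmax(self.gate(x), dim=-1)          # (..., E)
        expert_out = torch.stack([e(x) for e in self.experts], dim=-1)  # (..., D_out, E)
        routed = (expert_out * weights.unsqueeze(-2)).sum(-1)  # (..., D_out)
        return self.base(x) + routed
```

In practice such a module would wrap, e.g., the attention projections of the video backbone, so that only the gate and expert LoRAs are trained during multi-effect tuning.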
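For SAP and IIF, the abstract says only that spatial mask information is injected into the text tokens and that per-effect control signals are kept separate. The sketch below assumes two hypothetical mechanisms consistent with that description: a pooled mask token prepended to each effect's prompt, and a block-diagonal attention mask that blocks cross-effect information flow; `proj`, the pooling size, and the mask layout are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def spatial_aware_prompts(prompt_tokens, masks, proj: nn.Linear):
    """SAP-style fusion: inject each effect's spatial mask into its prompt.

    prompt_tokens: list of (L_i, D) text embeddings, one per effect
    masks:         list of (H, W) binary masks marking target locations
    proj:          assumed linear layer mapping a pooled 16x16 mask (256) to D
    """
    fused = []
    for tok, m in zip(prompt_tokens, masks):
        pooled = F.adaptive_avg_pool2d(m[None, None].float(), 16)  # (1, 1, 16, 16)
        mask_tok = proj(pooled.flatten(1))                         # (1, D)
        fused.append(torch.cat([mask_tok, tok], dim=0))            # prepend mask token
    return fused

def iif_attention_mask(lengths):
    """IIF-style isolation: block-diagonal boolean mask so tokens of one
    effect cannot attend to tokens of another, preventing blending."""
    total = sum(lengths)
    allow = torch.zeros(total, total, dtype=torch.bool)
    start = 0
    for n in lengths:
        allow[start:start + n, start:start + n] = True
        start += n
    return allow
```

The block-diagonal mask is the simplest realization of "independent information flow": each effect's (mask token + prompt) group conditions the video tokens for its region while never exchanging information with the other effects' groups.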