Command Palette
Search for a command to run...
JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models
Li Peike ; Chen Boyu ; Yao Yao ; Wang Yikai ; Wang Allen ; Wang Alex

Abstract
Music generation has attracted growing interest with the advancement of deepgenerative models. However, generating music conditioned on textualdescriptions, known as text-to-music, remains challenging due to the complexityof musical structures and high sampling rate requirements. Despite the task'ssignificance, prevailing generative models exhibit limitations in musicquality, computational efficiency, and generalization. This paper introducesJEN-1, a universal high-fidelity model for text-to-music generation. JEN-1 is adiffusion model incorporating both autoregressive and non-autoregressivetraining. Through in-context learning, JEN-1 performs various generation tasksincluding text-guided music generation, music inpainting, and continuation.Evaluations demonstrate JEN-1's superior performance over state-of-the-artmethods in text-music alignment and music quality while maintainingcomputational efficiency. Our demos are available athttps://jenmusic.ai/audio-demos
Code Repositories
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| text-to-music-generation-on-musiccaps | JEN-1 | FAD: 2.00 KL_passt: 1.29 |
Build AI with AI
From idea to launch — accelerate your AI development with free AI co-coding, out-of-the-box environment and best price of GPUs.