
Simple and Controllable Music Generation

Jade Copet, Felix Kreuk, Itai Gat, Tal Remez, David Kant, Gabriel Synnaeve, Yossi Adi, Alexandre Défossez


Abstract

We tackle the task of conditional music generation. We introduce MusicGen, a single Language Model (LM) that operates over several streams of compressed discrete music representation, i.e., tokens. Unlike prior work, MusicGen is comprised of a single-stage transformer LM together with efficient token interleaving patterns, which eliminates the need for cascading several models, e.g., hierarchically or upsampling. Following this approach, we demonstrate how MusicGen can generate high-quality samples, both mono and stereo, while being conditioned on textual description or melodic features, allowing better controls over the generated output. We conduct extensive empirical evaluation, considering both automatic and human studies, showing the proposed approach is superior to the evaluated baselines on a standard text-to-music benchmark. Through ablation studies, we shed light on the importance of each of the components comprising MusicGen. Music samples, code, and models are available at https://github.com/facebookresearch/audiocraft
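For readers who want to try the released checkpoints, the sketch below shows one way to generate short clips from text prompts. It assumes the MusicGen API documented in the audiocraft repository linked above; the checkpoint name, prompts, clip duration, and output file names are illustrative choices, not the authors' exact script.

```python
# Minimal sketch of text-conditioned generation, assuming the MusicGen API
# documented in the audiocraft repository. Checkpoint name, prompts, and
# output paths are illustrative.
from audiocraft.models import MusicGen
from audiocraft.data.audio import audio_write

model = MusicGen.get_pretrained('facebook/musicgen-small')  # one of the released variants
model.set_generation_params(duration=8)  # length of each generated clip, in seconds

descriptions = [
    'upbeat acoustic folk with hand claps',
    'slow ambient pad with soft piano',
]
wav = model.generate(descriptions)  # one waveform per description, at model.sample_rate

for i, one_wav in enumerate(wav):
    # Write each clip to disk with loudness normalization.
    audio_write(f'musicgen_sample_{i}', one_wav.cpu(), model.sample_rate, strategy='loudness')
```

The melody-conditioned variants evaluated in the benchmarks below are exposed through the same interface; the repository documents a chroma-based entry point for that use case.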

Benchmarks

Benchmark                                 Methodology                         Metrics
text-to-music-generation-on-musiccaps     MusicGen w/ random melody (1.5B)    FAD: 5.0, KL_passt: 1.31
text-to-music-generation-on-musiccaps     MusicGen w/o melody (3.3B)          FAD: 3.8, FD_openl3: 197.12, KL_passt: 1.31
text-to-music-generation-on-musiccaps     MusicGen w/o melody (1.5B)          FAD: 3.4, KL_passt: 1.23
