Contrastive Learning based Deep Latent Masking for Music Source Separation

Hong-Goo Kang, Jihyun Kim

Abstract

Recent studies on music source separation have extended their applicability to generic audio signals. Real-time applications for music source separation are necessary to provide services such as custom equalizers or to improve the sound of live streaming with diverse effects. However, most prior methods are unsuitable for real-time applications due to their high computational complexity, large memory usage, or long latency. To overcome these problems, we propose a Wave-U-Net type of music source separation network that utilizes high-dimensional masking for the deep latent domain features. We also introduce a contrastive learning technique to estimate the salient latent space embedding of each target source using a masking-based approach. The performance of our proposed model is evaluated on the MUSDB18HQ dataset in comparison with several baselines. The experiments confirm that our proposed model is capable of real-time processing and outperforms existing models.
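The two ideas named in the abstract, masking in a deep latent domain and a contrastive objective on per-source embeddings, can be illustrated with a minimal sketch. The listing gives no architectural details, so everything here is an assumption for illustration: the sigmoid mask, the `[channels][frames]` layout, the cosine similarity, and the generic InfoNCE-style loss are stand-ins, not the authors' DLMNet implementation. All function names are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic function, used here to squash mask logits into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def apply_latent_mask(latent, mask_logits):
    """Element-wise masking of latent features.

    Both arguments are nested lists shaped [channels][frames] (an assumed
    layout). Each latent value is scaled by a mask value in (0, 1).
    """
    return [
        [z * sigmoid(m) for z, m in zip(z_row, m_row)]
        for z_row, m_row in zip(latent, mask_logits)
    ]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-8)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Generic InfoNCE-style contrastive loss (an assumed stand-in).

    The loss is low when the anchor is close to the positive embedding
    and far from every negative embedding.
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Toy latent with 2 channels and 2 frames; logits of 0 give a mask of 0.5.
latent = [[1.0, -2.0], [0.5, 4.0]]
mask_logits = [[0.0, 0.0], [100.0, -100.0]]
masked = apply_latent_mask(latent, mask_logits)

# Embedding of a masked source vs. a matching and a mismatching target.
loss_matched = info_nce([1.0, 0.0], [1.0, 0.0], [[-1.0, 0.0]])
loss_mismatched = info_nce([1.0, 0.0], [-1.0, 0.0], [[1.0, 0.0]])
```

In this toy setup a large positive logit passes a latent value through almost unchanged and a large negative logit suppresses it, while the contrastive loss is smaller when the anchor embedding matches its positive than when it matches a negative.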

Benchmarks

Benchmark: music-source-separation-on-musdb18
Methodology: DLMNet
Metrics:
SDR (avg): 6.47
SDR (bass): 7.29
SDR (drums): 7.05
SDR (other): 4.62
SDR (vocals): 6.91
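As a rough guide to reading these numbers, the sketch below computes a basic signal-to-distortion ratio in decibels and checks that the listed average matches the mean of the four per-stem scores. The `sdr_db` function is a simplified textbook definition and only an assumption about the metric's general form; MUSDB18 results are normally produced with a dedicated evaluation toolkit whose exact procedure (framing, aggregation) is not described on this page.

```python
import math


def sdr_db(reference, estimate):
    """Simplified SDR in dB: signal energy over error energy."""
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10(signal_energy / error_energy)


# Per-stem scores copied from the benchmark listing above.
stem_sdr = {"bass": 7.29, "drums": 7.05, "other": 4.62, "vocals": 6.91}
average_sdr = sum(stem_sdr.values()) / len(stem_sdr)

# Toy signals (hypothetical): an estimate with a 10% amplitude error.
toy_sdr = sdr_db([1.0, 0.0, -1.0, 0.0], [0.9, 0.0, -0.9, 0.0])

print(f"mean of stems: {average_sdr:.4f}")
print(f"toy SDR: {toy_sdr:.2f} dB")
```

The mean of the four stems agrees with the reported SDR (avg) of 6.47 to two decimal places, which suggests the average is a plain arithmetic mean over bass, drums, other, and vocals.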
