Music Source Separation Based on a Lightweight Deep Learning Framework (DTTNET: DUAL-PATH TFC-TDF UNET)
Junyu Chen, Susmitha Vekkot, Pancham Shukla

Abstract
Music source separation (MSS) aims to extract 'vocals', 'drums', 'bass' and 'other' tracks from a piece of mixed music. While deep learning methods have shown impressive results, there is a trend toward larger models. We introduce a novel and lightweight architecture called DTTNet, which is based on the Dual-Path Module and the Time-Frequency Convolutions Time-Distributed Fully-connected UNet (TFC-TDF UNet). DTTNet achieves 10.12 dB cSDR on 'vocals', compared to 10.01 dB reported for Bandsplit RNN (BSRNN), with 86.7% fewer parameters. We also assess pattern-specific performance and model generalization for intricate audio patterns.
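The abstract reports results in dB of signal-to-distortion ratio (SDR). As a rough illustration of what such a number measures, the sketch below computes a plain SDR between a reference source and an estimate, plus a median over fixed-length chunks. This is a simplified stand-in, not the paper's evaluation code: the function names `sdr_db` and `chunked_median_sdr`, the `eps` stabilizer, and the toy sine signal are all assumptions made for illustration, and official cSDR tooling may apply additional processing.

```python
import math
from statistics import median


def sdr_db(reference, estimate, eps=1e-10):
    """Plain SDR in dB: 10 * log10(||s||^2 / ||s - s_hat||^2).

    eps is an assumed stabilizer that avoids division by zero on silence.
    """
    signal = sum(r * r for r in reference)
    noise = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10((signal + eps) / (noise + eps))


def chunked_median_sdr(reference, estimate, chunk_len):
    """Median of per-chunk SDR over non-overlapping chunks (hypothetical helper)."""
    scores = [
        sdr_db(reference[i:i + chunk_len], estimate[i:i + chunk_len])
        for i in range(0, len(reference) - chunk_len + 1, chunk_len)
    ]
    return median(scores)


# Toy example: a sine "source" and an estimate with a 10% amplitude error.
ref = [math.sin(2 * math.pi * 5 * n / 1000) for n in range(4000)]
est = [0.9 * x for x in ref]

print(round(sdr_db(ref, est), 2))                    # whole-signal SDR
print(round(chunked_median_sdr(ref, est, 1000), 2))  # median over 4 chunks
```

In this toy case the error energy is 1% of the signal energy, so both values come out at about 20 dB. On this scale, the 0.11 dB gap between the 10.12 dB and 10.01 dB figures quoted above is small, which is why the abstract emphasizes the parameter reduction.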
Benchmarks
| Benchmark | Methodology | SDR avg (dB) | SDR bass (dB) | SDR drums (dB) | SDR others (dB) | SDR vocals (dB) |
|---|---|---|---|---|---|---|
| music-source-separation-on-musdb18-hq | Dual-Path TFC-TDF UNet (DTTNet) | 8.15 | 7.55 | 7.82 | 7.02 | 10.21 |
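
The average SDR in the benchmark row is consistent with a simple arithmetic mean of the four per-source values. A minimal check, using only the numbers listed in the table (the variable names are illustrative):

```python
# Per-source SDR values (dB) copied from the benchmark row.
per_source_sdr = {"bass": 7.55, "drums": 7.82, "others": 7.02, "vocals": 10.21}

# Unweighted mean across the four sources.
average = sum(per_source_sdr.values()) / len(per_source_sdr)
print(f"{average:.2f}")
```

The mean works out to 8.15 dB, matching the reported average.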