Sandglasset: A Light Multi-Granularity Self-attentive Network For Time-Domain Speech Separation

Max W. Y. Lam, Jun Wang, Dan Su, Dong Yu

Abstract

One of the leading single-channel speech separation (SS) models is based on a TasNet with a dual-path segmentation technique, where the size of each segment remains unchanged throughout all layers. In contrast, our key finding is that multi-granularity features are essential for enhancing contextual modeling and computational efficiency. We introduce a self-attentive network with a novel sandglass shape, namely Sandglasset, which advances the state-of-the-art (SOTA) SS performance at a significantly smaller model size and computational cost. Moving forward through the blocks inside Sandglasset, the temporal granularity of the features gradually becomes coarser until reaching the halfway point of the network blocks, and then successively turns finer towards the raw signal level. We also find that residual connections between features with the same granularity are critical for preserving information after passing through the bottleneck layer. Experiments show that Sandglasset, with only 2.3M parameters, achieves the best results on two benchmark SS datasets, WSJ0-2mix and WSJ0-3mix, where the SI-SNRi scores improve by an absolute 0.8 dB and 2.4 dB, respectively, compared with the prior SOTA results.
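
The coarse-then-fine layout described in the abstract can be sketched as a per-block granularity schedule. This is only an illustration, not the authors' implementation: the function names `sandglass_schedule` and `residual_pairs`, the block count of 6, and the downsampling base of 4 are all assumptions chosen for the example, since the abstract does not state them.

```python
def sandglass_schedule(num_blocks, base=4):
    """Return one hypothetical downsampling factor per block.

    Factors grow (coarser granularity) up to the middle of the network
    and shrink again (finer granularity) afterwards. The base of 4 is
    an assumption for illustration, not a value taken from the abstract.
    """
    if num_blocks < 1:
        raise ValueError("num_blocks must be positive")
    return [base ** min(b, num_blocks - 1 - b) for b in range(num_blocks)]


def residual_pairs(schedule):
    """Pair each block in the coarsening half with the block in the
    refining half that works at the same granularity.

    With an odd number of blocks the middle block stays unpaired.
    """
    n = len(schedule)
    return [(b, n - 1 - b) for b in range(n // 2)]


schedule = sandglass_schedule(6)
print(schedule)                  # [1, 4, 16, 16, 4, 1]
print(residual_pairs(schedule))  # [(0, 5), (1, 4), (2, 3)]
```

Each pair marks two blocks whose features share a granularity, which is where a residual connection of the kind mentioned in the abstract could be placed.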

Benchmarks

Benchmark | Methodology | Metrics
speech-separation-on-wsj0-2mix | Sandglasset | SI-SDRi: 21.0
speech-separation-on-wsj0-3mix | Sandglasset | SI-SDRi: 17.1
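
For readers unfamiliar with the metric, the following is a minimal sketch of how a scale-invariant signal-to-noise ratio and its improvement over the unprocessed mixture are commonly computed. It assumes the widely used zero-mean formulation, and that the SI-SDRi label in the table and the SI-SNRi term in the abstract refer to the same quantity; neither is confirmed by this page. The helper names and the `eps` stabiliser are illustrative choices, not part of any evaluation toolkit.

```python
import math


def _zero_mean(x):
    """Subtract the mean so a constant offset does not affect the score."""
    mean = sum(x) / len(x)
    return [v - mean for v in x]


def si_snr(estimate, reference, eps=1e-8):
    """Scale-invariant SNR in dB between two equal-length signals."""
    est = _zero_mean(estimate)
    ref = _zero_mean(reference)
    # Project the estimate onto the reference to obtain the target part.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref) + eps
    scale = dot / ref_energy
    target = [scale * r for r in ref]
    # Everything not explained by the reference counts as noise.
    noise = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(n * n for n in noise) + eps
    return 10.0 * math.log10(target_energy / noise_energy + eps)


def si_snr_improvement(estimate, mixture, reference):
    """Gain in SI-SNR of the separated estimate over the raw mixture."""
    return si_snr(estimate, reference) - si_snr(mixture, reference)
```

A positive improvement means the separated signal is closer to the reference speaker than the input mixture was; benchmark figures of this kind are typically averaged over all speakers and utterances in the test set.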