On Time Domain Conformer Models for Monaural Speech Separation in Noisy Reverberant Acoustic Environments

William Ravenscroft, Stefan Goetze, Thomas Hain

Abstract

Speech separation remains an important topic for multi-speaker technology researchers. Convolution-augmented transformers (conformers) have performed well on many speech processing tasks but have been under-researched for speech separation. Most recent state-of-the-art (SOTA) separation models have been time-domain audio separation networks (TasNets). A number of successful models have made use of dual-path (DP) networks, which sequentially process local and global information. Time-domain conformers (TD-Conformers) are an analogue of the DP approach in that they also process local and global context sequentially, but they have a different time complexity function. It is shown that for realistic shorter signal lengths, conformers are more efficient when controlling for feature dimension. Subsampling layers are proposed to further improve computational efficiency. The best TD-Conformer achieves 14.6 dB and 21.2 dB SI-SDR improvement on the WHAMR and WSJ0-2Mix benchmarks, respectively.
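The SI-SDR improvement figures quoted in the abstract can be made concrete with a small sketch. The code below is not taken from the paper or its repository; it assumes the commonly used definition of scale-invariant signal-to-distortion ratio (project the zero-mean estimate onto the zero-mean reference, then compare the energy of the projection with the energy of the residual) and reports the improvement as the estimate's score minus the score of the unprocessed mixture. The function names `si_sdr` and `si_sdr_improvement` and the `eps` stabiliser are illustrative choices.

```python
import math


def si_sdr(estimate, reference, eps=1e-8):
    """Scale-invariant SDR in dB for two equal-length sequences of floats.

    Assumed definition: remove the mean of both signals, scale the
    reference by the projection coefficient <est, ref> / ||ref||^2,
    and take 10*log10 of target energy over residual energy.
    """
    if len(estimate) != len(reference):
        raise ValueError("signals must have the same length")
    # Remove the mean so a DC offset does not affect the score.
    est_mean = sum(estimate) / len(estimate)
    ref_mean = sum(reference) / len(reference)
    est = [x - est_mean for x in estimate]
    ref = [x - ref_mean for x in reference]
    # Optimal scaling of the reference towards the estimate.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref) + eps
    scale = dot / ref_energy
    target = [scale * r for r in ref]
    residual = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    residual_energy = sum(n * n for n in residual)
    return 10.0 * math.log10((target_energy + eps) / (residual_energy + eps))


def si_sdr_improvement(estimate, mixture, reference):
    """SI-SDR of the separated estimate minus SI-SDR of the input mixture."""
    return si_sdr(estimate, reference) - si_sdr(mixture, reference)


if __name__ == "__main__":
    n = 1000
    # Toy signals: a target tone and an interfering tone (hypothetical data).
    target = [math.sin(2 * math.pi * 5 * i / n) for i in range(n)]
    interferer = [math.sin(2 * math.pi * 13 * i / n) for i in range(n)]
    mixture = [t + v for t, v in zip(target, interferer)]
    # Pretend a separator attenuated the interferer by a factor of 10.
    estimate = [t + 0.1 * v for t, v in zip(target, interferer)]
    print(f"SI-SDRi: {si_sdr_improvement(estimate, mixture, target):.2f} dB")
```

Because the reference is rescaled before the comparison, multiplying the estimate by any non-zero gain leaves the score unchanged, which is why the metric suits separation models whose output level is not constrained.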

Code Repositories

jwr1995/pubsep (official implementation, PyTorch)

Benchmarks

Benchmark                          Methodology               SI-SDRi (dB)
speech-separation-on-whamr         TD-Conformer (S)          10.5
speech-separation-on-whamr         TD-Conformer (L) + DM     13.4
speech-separation-on-whamr         TD-Conformer (XL) + DM    14.6
speech-separation-on-whamr         TD-Conformer (M) + DM     12
speech-separation-on-wsj0-2mix     TD-Conformer (XL) + DM    21.2
