HyperAI

Speech Separation on WSJ0-2mix

Metrics

Number of parameters (M)
SDRi
SI-SDRi
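
The page lists SI-SDRi without defining it. As a rough illustration, the sketch below assumes the commonly used definition: scale-invariant SDR computed on zero-mean signals, with the improvement measured against the unprocessed mixture. The function names and the `eps` stabilizer are illustrative choices, not taken from this page or from any of the listed papers, and individual papers may evaluate with slightly different tooling.

```python
import math


def _dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def _zero_mean(x):
    """Remove the mean so the metric ignores DC offset."""
    m = sum(x) / len(x)
    return [v - m for v in x]


def si_sdr(estimate, reference, eps=1e-12):
    """Scale-invariant SDR in dB (assumed standard definition)."""
    est = _zero_mean(estimate)
    ref = _zero_mean(reference)
    # Project the estimate onto the reference to get the scaled target.
    scale = _dot(est, ref) / (_dot(ref, ref) + eps)
    target = [scale * r for r in ref]
    residual = [e - t for e, t in zip(est, target)]
    ratio = (_dot(target, target) + eps) / (_dot(residual, residual) + eps)
    return 10.0 * math.log10(ratio)


def si_sdr_improvement(estimate, reference, mixture):
    """SI-SDRi: gain of the separated estimate over the raw mixture, in dB."""
    return si_sdr(estimate, reference) - si_sdr(mixture, reference)


# Toy example: the interferer is orthogonal to the reference.
reference = [1.0, -1.0, 1.0, -1.0]
interferer = [1.0, 1.0, -1.0, -1.0]
mixture = [r + i for r, i in zip(reference, interferer)]
estimate = [r + 0.1 * i for r, i in zip(reference, interferer)]
improvement_db = si_sdr_improvement(estimate, reference, mixture)
```

In this toy case the mixture sits at about 0 dB and the estimate at about 20 dB, so the improvement is about 20 dB. Because the target is rescaled by projection, multiplying the estimate by any non-zero constant leaves the score unchanged.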

Results

Performance results of various models on this benchmark

Model Name | Number of parameters (M) | SDRi (dB) | SI-SDRi (dB) | Paper Title | Repository
TF-Locoformer (S) + DM | 5.0 | 23 | 22.8 | TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement |
TD-Conformer (XL) + DM | - | - | 21.2 | On Time Domain Conformer Models for Monaural Speech Separation in Noisy Reverberant Acoustic Environments |
Conv-TasNet | 5.1 | - | 15.3 | Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation |
Two-step Conv-TasNet | - | - | 16.1 | Two-Step Sound Source Separation: Training on Learned Latent Targets |
TF-Locoformer (M) | 15.0 | 23.8 | 23.6 | TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement |
DPTNet | - | - | 20.2 | Dual-Path Transformer Network: Direct Context-Aware Modeling for End-to-End Monaural Speech Separation |
SepReformer-L | 59.4 | 25.2 | 25.1 | Separate and Reconstruct: Asymmetric Encoder-Decoder for Speech Separation |
Wavesplit v2 | - | 22.3 | 22.2 | Wavesplit: End-to-End Speech Separation by Speaker Clustering | -
Deformable TCN + Dynamic Mixing | 3.6 | 17.4 | 17.2 | Deformable Temporal Convolutional Networks for Monaural Noisy Reverberant Speech Separation |
Separate And Diffuse | - | - | 23.9 | Separate And Diffuse: Using a Pretrained Diffusion Model for Improving Source Separation | -
TasNet | - | - | 10.8 | TasNet: time-domain audio separation network for real-time, single-channel speech separation |
Sandglasset | - | - | 21.0 | Sandglasset: A Light Multi-Granularity Self-attentive Network For Time-Domain Speech Separation |
SPGM + DM | 26.2 | - | 22.7 | SPGM: Prioritizing Local Features for enhanced speech separation performance |
Deep Clustering ++ | - | - | 10.8 | Deep clustering: Discriminative embeddings for segmentation and separation |
SPGM | 26.2 | - | 22.1 | SPGM: Prioritizing Local Features for enhanced speech separation performance |
Sudo rm -rf (U=36) | - | - | 19.5 | Compute and memory efficient universal sound source separation |
SepTDA (L=12) | - | - | 24.0 | Boosting Unknown-number Speaker Separation with Transformer Decoder-based Attractor | -
SepIt | - | - | 22.4 | SepIt: Approaching a Single Channel Speech Separation Bound | -
MossFormer (L) + DM | 42.1 | - | 22.8 | MossFormer: Pushing the Performance Limit of Monaural Speech Separation using Gated Single-Head Transformer with Convolution-Augmented Joint Self-Attentions |
Gated DualPathRNN | - | - | 20.12 | Voice Separation with an Unknown Number of Multiple Speakers |