| Model | Score | Paper |
| --- | --- | --- |
| W2V2-L-LL60K (pipeline approach, uses LM) | 69.6 | SLUE: New Benchmark Tasks for Spoken Language Understanding Evaluation on Natural Speech |
| W2V2-B-LS960 (pipeline approach, uses LM) | 68.0 | SLUE: New Benchmark Tasks for Spoken Language Understanding Evaluation on Natural Speech |
| Wav2Seq (from HuBERT-large) | 65.4 | Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages |
| W2V2-L-LL60K (e2e approach, uses LM) | 64.8 | SLUE: New Benchmark Tasks for Spoken Language Understanding Evaluation on Natural Speech |
| W2V2-B-LS960 (e2e approach, uses LM) | 63.4 | SLUE: New Benchmark Tasks for Spoken Language Understanding Evaluation on Natural Speech |
| HuBERT-B-LS960 (e2e approach, uses LM) | 61.9 | SLUE: New Benchmark Tasks for Spoken Language Understanding Evaluation on Natural Speech |
| W2V2-B-VP100K (e2e approach, uses LM) | 61.8 | SLUE: New Benchmark Tasks for Spoken Language Understanding Evaluation on Natural Speech |
| W2V2-L-LL60K (pipeline approach) | 57.8 | SLUE: New Benchmark Tasks for Spoken Language Understanding Evaluation on Natural Speech |
| W2V2-L-LL60K (e2e approach) | 50.9 | SLUE: New Benchmark Tasks for Spoken Language Understanding Evaluation on Natural Speech |
| W2V2-B-LS960 (e2e approach) | 50.2 | SLUE: New Benchmark Tasks for Spoken Language Understanding Evaluation on Natural Speech |
| HuBERT-B-LS960 (e2e approach) | 49.8 | SLUE: New Benchmark Tasks for Spoken Language Understanding Evaluation on Natural Speech |
| W2V2-B-LS960 (pipeline approach) | 49.5 | SLUE: New Benchmark Tasks for Spoken Language Understanding Evaluation on Natural Speech |
| W2V2-B-VP100K (e2e approach) | 47.9 | SLUE: New Benchmark Tasks for Spoken Language Understanding Evaluation on Natural Speech |
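
A minimal Python sketch for reproducing the ranking above from the raw rows; the model names and scores are copied verbatim from the table, and nothing beyond standard-library sorting is assumed.

```python
# Sort the SLUE leaderboard entries above by score, descending.
# (model, score) pairs are taken verbatim from the table; no external data.
leaderboard = [
    ("W2V2-L-LL60K (pipeline approach, uses LM)", 69.6),
    ("W2V2-B-LS960 (pipeline approach, uses LM)", 68.0),
    ("Wav2Seq (from HuBERT-large)", 65.4),
    ("W2V2-L-LL60K (e2e approach, uses LM)", 64.8),
    ("W2V2-B-LS960 (e2e approach, uses LM)", 63.4),
    ("HuBERT-B-LS960 (e2e approach, uses LM)", 61.9),
    ("W2V2-B-VP100K (e2e approach, uses LM)", 61.8),
    ("W2V2-L-LL60K (pipeline approach)", 57.8),
    ("W2V2-L-LL60K (e2e approach)", 50.9),
    ("W2V2-B-LS960 (e2e approach)", 50.2),
    ("HuBERT-B-LS960 (e2e approach)", 49.8),
    ("W2V2-B-LS960 (pipeline approach)", 49.5),
    ("W2V2-B-VP100K (e2e approach)", 47.9),
]

# Print a ranked view, highest score first.
for rank, (model, score) in enumerate(
    sorted(leaderboard, key=lambda row: row[1], reverse=True), start=1
):
    print(f"{rank:2d}. {model:<45} {score:.1f}")
```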