Speech Recognition On Librispeech Test Other

Evaluation metric: Word Error Rate (WER)
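WER is the word-level edit distance between the hypothesis and the reference transcript (substitutions + deletions + insertions) divided by the number of reference words; lower is better. Below is a minimal, self-contained sketch of how the metric is typically computed; the function name and example strings are illustrative and not taken from this page.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words,
    computed as a word-level Levenshtein (edit) distance."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # delete all i reference words
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # insert all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            if ref[i - 1] == hyp[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]
            else:
                dp[i][j] = 1 + min(dp[i - 1][j - 1],  # substitution
                                   dp[i - 1][j],      # deletion
                                   dp[i][j - 1])      # insertion
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)


# Example: 1 substitution + 1 deletion over 4 reference words -> WER = 0.5
print(word_error_rate("the cat sat down", "the hat sat"))  # 0.5
```

The leaderboard values below report this ratio as a percentage (e.g. 5.6 means 5.6% WER on the LibriSpeech test-other split).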
Results: performance of each model on this benchmark.
| Model | Word Error Rate (WER) | Paper Title | Repository |
|---|---|---|---|
| Local Prior Matching (Large Model) | 20.84 | Semi-Supervised Speech Recognition via Local Prior Matching | |
| Snips | 16.5 | Snips Voice Platform: an embedded Spoken Language Understanding system for private-by-design voice interfaces | |
| Local Prior Matching (Large Model, ConvLM LM) | 15.28 | Semi-Supervised Speech Recognition via Local Prior Matching | |
| Deep Speech 2 | 13.25 | Deep Speech 2: End-to-End Speech Recognition in English and Mandarin | |
| TDNN + pNorm + speed up/down speech | 12.5 | - | - |
| CTC-CRF 4gram-LM | 10.65 | CRF-based Single-stage Acoustic Modeling with CTC Topology | - |
| Convolutional Speech Recognition | 10.47 | Fully Convolutional Speech Recognition | - |
| MT4SSL | 9.6 | MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets | |
| Jasper DR 10x5 | 8.79 | Jasper: An End-to-End Convolutional Neural Acoustic Model | |
| Espresso | 8.7 | Espresso: A Fast End-to-end Neural Speech Recognition Toolkit | |
| Jasper DR 10x5 (+ Time/Freq Masks) | 7.84 | Jasper: An End-to-End Convolutional Neural Acoustic Model | |
| tdnn + chain + rnnlm rescoring | 7.63 | Neural Network Language Modeling with Letter-based Features and Importance Sampling | - |
| QuartzNet15x5 | 7.25 | QuartzNet: Deep Automatic Speech Recognition with 1D Time-Channel Separable Convolutions | |
| Conformer with Relaxed Attention | 6.85 | Relaxed Attention: A Simple Method to Boost Performance of End-to-End Automatic Speech Recognition | |
| LAS (no LM) | 6.5 | SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition | |
| Squeezeformer (L) | 5.97 | Squeezeformer: An Efficient Transformer for Automatic Speech Recognition | |
| LAS + SpecAugment | 5.8 | SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition | |
| Multi-Stream Self-Attention With Dilated 1D Convolutions | 5.80 | State-of-the-Art Speech Recognition Using Multi-Stream Self-Attention With Dilated 1D Convolutions | |
| Transformer | 5.7 | A Comparative Study on Transformer vs RNN in Speech Applications | |
| LSTM Transducer | 5.6 | Librispeech Transducer Model with Internal Language Model Prior Correction | |
(Showing the first 20 of 53 leaderboard entries.)