Speech Recognition On Librispeech Test Other

Evaluation metric: Word Error Rate (WER)
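WER is the word-level edit distance between the hypothesis and the reference transcript (substitutions + deletions + insertions) divided by the number of reference words; lower is better. Below is a minimal, self-contained sketch of how the metric is typically computed; the function name and example strings are illustrative and not taken from this page.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words,
    computed as a word-level Levenshtein (edit) distance."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # delete all i reference words
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # insert all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            if ref[i - 1] == hyp[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]
            else:
                dp[i][j] = 1 + min(dp[i - 1][j - 1],  # substitution
                                   dp[i - 1][j],      # deletion
                                   dp[i][j - 1])      # insertion
    return dp[len(ref)][len(hyp)] / max(len(ref), 1)


# Example: 1 substitution + 1 deletion over 4 reference words -> WER = 0.5
print(word_error_rate("the cat sat down", "the hat sat"))  # 0.5
```

The leaderboard values below report this ratio as a percentage (e.g. 5.6 means 5.6% WER on the LibriSpeech test-other split).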
Results: performance of each model on this benchmark.
| Model | Word Error Rate (WER) | Paper Title | Repository |
|---|---|---|---|
| Local Prior Matching (Large Model) | 20.84 | Semi-Supervised Speech Recognition via Local Prior Matching | |
| Snips | 16.5 | Snips Voice Platform: an embedded Spoken Language Understanding system for private-by-design voice interfaces | |
| Local Prior Matching (Large Model, ConvLM LM) | 15.28 | Semi-Supervised Speech Recognition via Local Prior Matching | |
| Deep Speech 2 | 13.25 | Deep Speech 2: End-to-End Speech Recognition in English and Mandarin | |
| TDNN + pNorm + speed up/down speech | 12.5 | - | - |
| CTC-CRF 4gram-LM | 10.65 | CRF-based Single-stage Acoustic Modeling with CTC Topology | - |
| Convolutional Speech Recognition | 10.47 | Fully Convolutional Speech Recognition | - |
| MT4SSL | 9.6 | MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets | |
| Jasper DR 10x5 | 8.79 | Jasper: An End-to-End Convolutional Neural Acoustic Model | |
| Espresso | 8.7 | Espresso: A Fast End-to-end Neural Speech Recognition Toolkit | |
| Jasper DR 10x5 (+ Time/Freq Masks) | 7.84 | Jasper: An End-to-End Convolutional Neural Acoustic Model | |
| tdnn + chain + rnnlm rescoring | 7.63 | Neural Network Language Modeling with Letter-based Features and Importance Sampling | - |
| QuartzNet15x5 | 7.25 | QuartzNet: Deep Automatic Speech Recognition with 1D Time-Channel Separable Convolutions | |
| Conformer with Relaxed Attention | 6.85 | Relaxed Attention: A Simple Method to Boost Performance of End-to-End Automatic Speech Recognition | |
| LAS (no LM) | 6.5 | SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition | |
| Squeezeformer (L) | 5.97 | Squeezeformer: An Efficient Transformer for Automatic Speech Recognition | |
| LAS + SpecAugment | 5.8 | SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition | |
| Multi-Stream Self-Attention With Dilated 1D Convolutions | 5.80 | State-of-the-Art Speech Recognition Using Multi-Stream Self-Attention With Dilated 1D Convolutions | |
| Transformer | 5.7 | A Comparative Study on Transformer vs RNN in Speech Applications | |
| LSTM Transducer | 5.6 | Librispeech Transducer Model with Internal Language Model Prior Correction | |
(Showing the first 20 of 53 leaderboard entries.)