HyperAI
Speech Recognition On Aishell 1
Metrics
Params(M)
Word Error Rate (WER)
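For readers unfamiliar with the metric, the following is a minimal sketch of how a word error rate is conventionally computed: the edit distance (substitutions, deletions, insertions) between the reference and hypothesis token sequences, divided by the reference length. The function name `word_error_rate` and the sample sentences are hypothetical, and this is not the scoring script used by any of the listed papers. For Mandarin corpora such as AISHELL-1 the tokens are commonly individual characters, which is the assumption made in the example; the scores on this page appear to be percentages.

```python
def word_error_rate(reference, hypothesis):
    """Edit distance between two token lists, divided by the reference length."""
    if not reference:
        raise ValueError("reference must contain at least one token")
    # prev[j] is the edit distance between the reference prefix seen so far
    # and the first j hypothesis tokens.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_tok in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_tok in enumerate(hypothesis, start=1):
            cost = 0 if ref_tok == hyp_tok else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference token missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra token in hypothesis
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(reference)


# Hypothetical sentences, tokenized per character (an assumption for Mandarin).
ref = list("今天天气很好")
hyp = list("今天天气真好")
print(f"{100 * word_error_rate(ref, hyp):.2f}%")  # one substitution in six tokens: 16.67%
```

Because insertions count as errors but do not lengthen the reference, the ratio can exceed 100% for a hypothesis much longer than its reference.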
Results
Performance results of various models on this benchmark
| Model Name | Params(M) | Word Error Rate (WER) | Paper Title | Repository |
| --- | --- | --- | --- | --- |
| U2 | 47 | 4.72 | Unified Streaming and Non-streaming Two-pass End-to-end Model for Speech Recognition | - |
| Zipformer+CR-CTC (no external language model) | 66.2 | 4.02 | CR-CTC: Consistency regularization on CTC for improved speech recognition | - |
| Paraformer | 46.3 | 4.95 | FunASR: A Fundamental End-to-End Speech Recognition Toolkit | - |
| Qwen-Audio | - | 1.29 | Qwen-Audio: Advancing Universal Audio Understanding via Unified Large-Scale Audio-Language Models | - |
| Lightweight Transducer With LM | 45.3 | 4.03 | Lightweight Transducer Based on Frame-Level Criterion | - |
| Paraformer-large | 220 | 1.95 | FunASR: A Fundamental End-to-End Speech Recognition Toolkit | - |
| Att | - | 18.7 | End-to-end Speech Recognition with Adaptive Computation Steps | - |
| SE-WSBO With LM | 46 | 4.1 | Improving Mandarin Speech Recogntion with Block-augmented Transformer | - |
| CTC/Att | - | 6.7 | A Comparative Study on Transformer vs RNN in Speech Applications | - |
| CTC-CRF 4gram-LM | - | 6.34 | CAT: A CTC-CRF based ASR Toolkit Bridging the Hybrid and the End-to-end Approaches towards Data Efficiency and Low Latency | - |
| UMA | 44.7 | 4.7 | Unimodal Aggregation for CTC-based Speech Recognition | - |
| MMSpeech With LM | - | 1.9 | MMSpeech: Multi-modal Multi-task Encoder-Decoder Pre-training for Speech Recognition | - |
| CIF-HKD With LM | 47 | 4.1 | Knowledge Transfer from Pre-trained Language Models to Cif-based Speech Recognizers via Hierarchical Distillation | - |
| BRA-E | 8.5 | 6.63 | Beyond Universal Transformer: block reusing with adaptor in Transformer for automatic speech recognition | - |
| FireRedASR-AED | 1,100 | 0.55 | FireRedASR: Open-Source Industrial-Grade Mandarin Speech Recognition Models from Encoder-Decoder to LLM Integration | - |
| Seed-ASR | - | 0.68 | Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition | - |
| Lightweight Transducer | 45.3 | 4.31 | Lightweight Transducer Based on Frame-Level Criterion | - |
| BAT | 90 | 4.97 | BAT: Boundary aware transducer for memory-efficient and low-latency ASR | - |