Speech Synthesis On Libritts
Evaluation metrics
M-STFT
MCD
PESQ
Periodicity
V/UV F1
Results
Performance of each model on this benchmark
| Model | M-STFT | MCD | PESQ | Periodicity | V/UV F1 | Paper Title |
|---|---|---|---|---|---|---|
| PeriodWave-Turbo-L | 0.7358 | - | 4.454 | 0.0528 | 0.9756 | Accelerating High-Fidelity Waveform Generation via Adversarial Flow Matching Optimization |
| BigVGAN-v2 | 0.7026 | 0.2903 | 4.362 | 0.0593 | 0.9793 | BigVGAN: A Universal Neural Vocoder with Large-Scale Training |
| EVA-GAN-big | 0.7982 | - | 4.3536 | 0.0751 | 0.9745 | EVA-GAN: Enhanced Various Audio Generation via Scalable Generative Adversarial Networks |
| PeriodWave + FreeU | 1.0269 | - | 4.248 | 0.0765 | 0.9651 | PeriodWave: Multi-Period Flow Matching for High-Fidelity Waveform Generation |
| RFWave | - | - | 4.228 | 0.090 | 0.968 | RFWave: Multi-band Rectified Flow for Audio Waveform Reconstruction |
| BigVSAN (w/ snakebeta) | 0.7992 | 0.4129 | 4.120 | 0.0924 | 0.9644 | BigVSAN: Enhancing GAN-based Neural Vocoders with Slicing Adversarial Network |
| BigVSAN | 0.7881 | 0.3381 | 4.116 | 0.0935 | 0.9635 | BigVSAN: Enhancing GAN-based Neural Vocoders with Slicing Adversarial Network |
| EVA-GAN-base | 0.9485 | - | 4.0330 | 0.0942 | 0.9658 | EVA-GAN: Enhanced Various Audio Generation via Scalable Generative Adversarial Networks |
| BigVGAN | 0.7997 | 0.3745 | 4.027 | 0.1018 | 0.9598 | BigVGAN: A Universal Neural Vocoder with Large-Scale Training |
| Vocos | - | - | 3.70 | 0.101 | 0.9582 | Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis |
| BigVGAN-base | 0.8788 | 0.4564 | 3.519 | 0.1287 | 0.9459 | BigVGAN: A Universal Neural Vocoder with Large-Scale Training |
| WaveGlow | 1.3099 | 2.3591 | 3.138 | 0.1485 | 0.9378 | WaveGlow: A Flow-based Generative Network for Speech Synthesis |
| WaveFlow | 1.1120 | 1.2455 | 3.027 | 0.1416 | 0.9410 | WaveFlow: A Compact Flow-based Model for Raw Audio |
| HiFi-GAN | 1.0017 | 0.6603 | 2.947 | 0.1565 | 0.9300 | HiFi-GAN: Generative Adversarial Networks for Efficient and High Fidelity Speech Synthesis |
| SC-WaveRNN | 2.2358 | 1.8854 | 1.701 | 0.3044 | 0.9378 | Speaker Conditional WaveRNN: Towards Universal Neural Vocoder for Unseen Speaker and Recording Conditions |
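The results are listed in descending PESQ order, but the other metrics rank the models differently. The following is a minimal sketch of re-ranking a few of the rows above by any one metric. The names `ROWS`, `HIGHER_IS_BETTER`, and `rank` are hypothetical and not part of the leaderboard. The metric directions are an assumption based on the usual reading of these measures: M-STFT and Periodicity are treated as errors (lower is better), PESQ and V/UV F1 as scores (higher is better).

```python
# A subset of rows copied from the results above:
# (model, M-STFT, PESQ, Periodicity, V/UV F1).
# None marks a value the leaderboard does not report ("-").
ROWS = [
    ("PeriodWave-Turbo-L", 0.7358, 4.454, 0.0528, 0.9756),
    ("BigVGAN-v2", 0.7026, 4.362, 0.0593, 0.9793),
    ("EVA-GAN-big", 0.7982, 4.3536, 0.0751, 0.9745),
    ("RFWave", None, 4.228, 0.090, 0.968),
    ("HiFi-GAN", 1.0017, 2.947, 0.1565, 0.9300),
]

# Position of each metric inside a row tuple.
COLUMNS = {"M-STFT": 1, "PESQ": 2, "Periodicity": 3, "V/UV F1": 4}

# Assumed metric directions (not stated on the page itself).
HIGHER_IS_BETTER = {
    "M-STFT": False,
    "PESQ": True,
    "Periodicity": False,
    "V/UV F1": True,
}


def rank(metric):
    """Return model names ordered best-first for one metric.

    Models that do not report the metric are left out rather than
    being ranked with a placeholder value.
    """
    idx = COLUMNS[metric]
    reported = [row for row in ROWS if row[idx] is not None]
    ordered = sorted(
        reported,
        key=lambda row: row[idx],
        reverse=HIGHER_IS_BETTER[metric],
    )
    return [row[0] for row in ordered]


if __name__ == "__main__":
    for metric in COLUMNS:
        print(metric, "->", rank(metric))
```

Under these assumptions, PeriodWave-Turbo-L leads this subset on PESQ and Periodicity, while BigVGAN-v2 leads on M-STFT and V/UV F1, so the choice of metric changes which model ranks first.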