Language Modelling On One Billion Word
Evaluation metrics
Number of params
PPL
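PPL in the metrics above is perplexity: the exponential of the average negative log-likelihood per predicted unit, so lower is better. The following is a minimal sketch, assuming per-token natural-log probabilities have already been obtained from some model. The helper name `perplexity` and the sample values are illustrative and are not taken from the leaderboard.

```python
import math


def perplexity(log_probs):
    """Perplexity from per-token natural-log probabilities (hypothetical helper)."""
    if not log_probs:
        raise ValueError("need at least one token log-probability")
    # Average negative log-likelihood over all predicted tokens.
    avg_nll = -sum(log_probs) / len(log_probs)
    # Perplexity is the exponential of that average.
    return math.exp(avg_nll)


# Illustrative values only: a model that assigns probability 1/23 to every token
# has a perplexity close to 23.
uniform = [math.log(1 / 23)] * 5
print(perplexity(uniform))
```

The unit of normalization (word versus subword token) changes the value, so two PPL figures are comparable only when they are normalized over the same units.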
Evaluation results
Performance of each model on this benchmark
| Model | Number of params | PPL | Paper Title | Repository |
| --- | --- | --- | --- | --- |
| Sparse Non-Negative | 33B | 52.9 | Skip-gram Language Modeling Using Sparse Non-negative Matrix Probability Estimation | - |
| RNN-1024 + 9 Gram | 20B | 51.3 | One Billion Word Benchmark for Measuring Progress in Statistical Language Modeling | |
| GPT-2 | 1.54B | 42.16 | Language Models are Unsupervised Multitask Learners | - |
| BIG G-LSTM-2 | - | 36.0 | Factorization tricks for LSTM networks | |
| Low-Budget MoE | 5B | 34.1 | Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer | |
| GCNN-14 bottleneck | - | 31.9 | Language Modeling with Gated Convolutional Networks | |
| LSTM-8192-1024 | 1.8B | 30.6 | Exploring the Limits of Language Modeling | |
| LSTM-8192-1024 + CNN Input | 1.04B | 30.0 | Exploring the Limits of Language Modeling | |
| Evolved Transformer Big | - | 28.6 | The Evolved Transformer | |
| High-Budget MoE | 5B | 28.0 | Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer | |
| DynamicConv | 0.34B | 26.67 | Pay Less Attention with Lightweight and Dynamic Convolutions | |
| SRU++ | 328M | 25.1 | When Attention Meets Fast Recurrence: Training Language Models with Reduced Compute | |
| Cohere Large | - | 25.06 | - | - |
| Mesh Tensorflow | 4.9B | 24.0 | Mesh-TensorFlow: Deep Learning for Supercomputers | |
| Adaptive Input Large | 0.46B | 23.91 | Adaptive Input Representations for Neural Language Modeling | |
| 10 LSTM+CNN inputs + SNM10-SKIP (ensemble) | 43B | 23.7 | Exploring the Limits of Language Modeling | |
| Transformer-XL Base | 0.46B | 23.5 | Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context | |
| SRU++ Large | 465M | 23.5 | When Attention Meets Fast Recurrence: Training Language Models with Reduced Compute | |
| Adaptive Input Very Large | 1.0B | 23.02 | Adaptive Input Representations for Neural Language Modeling | |
| MDLM | 110M | 23.00 | Simple and Effective Masked Diffusion Language Models | |
20 of 27 rows shown.
Language Modelling On One Billion Word | SOTA | HyperAI超神经