Language Modelling on WikiText-103
Evaluation metrics: Number of params, Test perplexity

Results: performance of each model on this benchmark.
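Test perplexity is the exponentiated average negative log-likelihood per token over the WikiText-103 test set, ppl = exp(−(1/N) Σᵢ log p(xᵢ | x₍<ᵢ₎)); lower is better. A minimal sketch of the computation, assuming per-token log-probabilities from some model are already in hand (the helper name is illustrative, not from any leaderboard code):

```python
import math

def test_perplexity(token_log_probs):
    """exp of the mean negative log-likelihood per token.

    token_log_probs: natural-log probabilities the model assigned to
    each ground-truth token of the test set, in order.
    """
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)

# Sanity check: assigning probability 1/48.7 to every token yields
# a perplexity of 48.7 (the score of the LSTM row below).
print(test_perplexity([math.log(1 / 48.7)] * 1000))  # ~48.7
```

Note that WikiText-103 results are conventionally reported as word-level perplexity, so models with subword vocabularies normalize the summed NLL by the number of words rather than by the number of subword tokens.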
| Model name | Number of params | Test perplexity | Paper Title |
|---|---|---|---|
| LSTM | - | 48.7 | Improving Neural Language Models with a Continuous Cache |
| Temporal CNN | - | 45.2 | Convolutional Sequence Modeling Revisited |
| TCN | - | 45.19 | An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling |
| GCNN-8 | - | 44.9 | Language Modeling with Gated Convolutional Networks |
| Neural cache model (size = 100) | - | 44.8 | Improving Neural Language Models with a Continuous Cache |
| Neural cache model (size = 2,000) | - | 40.8 | Improving Neural Language Models with a Continuous Cache |
| GPT-2 Small | 124M | 37.50 | Language Models are Unsupervised Multitask Learners |
| GCNN-8 | - | 37.2 | Language Modeling with Gated Convolutional Networks |
| LSTM | - | 36.4 | Fast Parametric Learning with Activation Memorization |
| LSTM (Hebbian) | - | 34.3 | Fast Parametric Learning with Activation Memorization |
| 4 layer QRNN | 151M | 33.0 | An Analysis of Neural Language Modeling at Multiple Scales |
| AWD-LSTM-MoS + ATOI | - | 32.85 | Alleviating Sequence Information Loss with Data Overlapping and Prime Batch Sizes |
| DEQ-Transformer (small) | 138M | 32.4 | Deep Equilibrium Models |
| LSTM (RMC) | - | 31.6 | Relational recurrent neural networks |
| Primal.+Trans. | - | 31.0 | Primal-Attention: Self-attention through Asymmetric Kernel SVD in Primal Representation |
| Rfa-Gate-Gaussian-Stateful (Small) | - | 30.5 | Random Feature Attention |
| LSTM (Hebbian, Cache) | - | 29.7 | Fast Parametric Learning with Activation Memorization |
| LSTM (Hebbian, Cache, MbPA) | - | 29.2 | Fast Parametric Learning with Activation Memorization |
| Trellis Network | - | 29.19 | Trellis Networks for Sequence Modeling |
| DEQ-TrellisNet | 180M | 29.0 | Deep Equilibrium Models |
Showing the first 20 of 89 leaderboard entries.
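As a worked example of how a score in this table is produced, the sketch below estimates GPT-2 Small's test perplexity on WikiText-103 with the standard sliding-window recipe, using the Hugging Face transformers and datasets libraries. The stride value is a free choice, and since the paper's 37.50 was computed with OpenAI's own preprocessing rather than raw subword NLL, this should be read as an approximation, not a reproduction of the table entry.

```python
import math
import torch
from datasets import load_dataset
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

device = "cuda" if torch.cuda.is_available() else "cpu"
model = GPT2LMHeadModel.from_pretrained("gpt2").to(device).eval()
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

# Concatenate the test split into one long token stream.
test = load_dataset("wikitext", "wikitext-103-raw-v1", split="test")
encodings = tokenizer("\n\n".join(test["text"]), return_tensors="pt")

max_len = model.config.n_positions  # 1024 for GPT-2 Small
stride = 512                        # overlap gives scored tokens context
seq_len = encodings.input_ids.size(1)

nll_sum, n_tokens, prev_end = 0.0, 0, 0
for begin in range(0, seq_len, stride):
    end = min(begin + max_len, seq_len)
    trg_len = end - prev_end  # score only tokens not already scored
    input_ids = encodings.input_ids[:, begin:end].to(device)
    target_ids = input_ids.clone()
    target_ids[:, :-trg_len] = -100  # mask pure-context positions
    with torch.no_grad():
        # loss is the mean NLL over the unmasked target positions
        loss = model(input_ids, labels=target_ids).loss
    nll_sum += loss.item() * trg_len
    n_tokens += trg_len
    prev_end = end
    if end == seq_len:
        break

print("subword-level test perplexity:", math.exp(nll_sum / n_tokens))
```

A smaller stride gives each scored token more left context at the cost of more forward passes; setting stride equal to max_len degenerates to non-overlapping windows and overstates perplexity.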