
Language Modelling on Penn Treebank (Word Level)

Evaluation Metrics

Params: number of trainable model parameters
Test perplexity: word-level perplexity on the test split (lower is better; see the sketch below)
Validation perplexity: word-level perplexity on the validation split (lower is better)
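
Both perplexity metrics are computed the same way: perplexity is the exponential of the model's average per-word negative log-likelihood on the corresponding split, so lower is better. A minimal sketch in Python; the loss value and token count below are illustrative assumptions, not figures taken from the table:

```python
import math

def perplexity(total_nll_nats: float, num_tokens: int) -> float:
    """Perplexity = exp(average negative log-likelihood per word), in nats."""
    return math.exp(total_nll_nats / num_tokens)

# Illustrative numbers only (not drawn from any row in the table):
# a model averaging ~3.8 nats of cross-entropy per word on the test split.
avg_nll = 3.8
num_test_tokens = 82_430  # hypothetical token count for the test split
print(round(perplexity(avg_nll * num_test_tokens, num_test_tokens), 2))  # ~44.7

# The "Params" column counts trainable parameters; with a PyTorch model
# (hypothetical `model` object) it would be:
#   sum(p.numel() for p in model.parameters())
```

Because a larger model can trade parameters for lower perplexity, the Params column is what makes rows comparable at similar model sizes.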

Evaluation Results

The results of each model on this benchmark are listed below.

Comparison Table
| Model name | Params | Test perplexity | Validation perplexity |
| --- | --- | --- | --- |
| gradual-learning-of-recurrent-neural-networks | 26M | 46.34 | 46.64 |
| tying-word-vectors-and-word-classifiers-a | - | 66.0 | 68.1 |
| partially-shuffling-the-training-data-to-1 | 22M | 53.92 | 55.89 |
| partially-shuffling-the-training-data-to-1 | 23M | 52.0 | 53.79 |
| a-theoretically-grounded-application-of | - | 79.7 | 81.9 |
| improving-neural-language-modeling-via | 22M | 46.01 | 46.63 |
| direct-output-connection-for-a-high-rank | 185M | 47.17 | 48.63 |
| pushing-the-bounds-of-dropout | 24M | 55.3 | 57.1 |
| neural-architecture-search-with-reinforcement | 25M | 64.0 | - |
| trellis-networks-for-sequence-modeling | - | 54.19 | - |
| mogrifier-lstm | 24M | 44.9 | 44.8 |
| deep-equilibrium-models | 24M | 57.1 | - |
| an-empirical-evaluation-of-generic | - | 78.93 | - |
| seq-u-net-a-one-dimensional-causal-u-net-for | 14.7M | 108.47 | - |
| dynamic-evaluation-of-neural-sequence-models | 24M | 51.1 | 51.6 |
| recurrent-highway-networks | 23M | 65.4 | 67.9 |
| frage-frequency-agnostic-word-representation | 22M | 46.54 | 47.38 |
| regularizing-and-optimizing-lstm-language | 24M | 52.8 | 53.9 |
| improved-language-modeling-by-decoding-the | 22M | 47.3 | 48.0 |
| seq-u-net-a-one-dimensional-causal-u-net-for | 14.9M | 107.95 | - |
| direct-output-connection-for-a-high-rank | 23M | 52.38 | 54.12 |
| deep-independently-recurrent-neural-network | - | 50.97 | - |
| deep-independently-recurrent-neural-network | - | 56.37 | - |
| regularizing-and-optimizing-lstm-language | 24M | 57.3 | 60.0 |
| recurrent-neural-network-regularization | - | 78.4 | 82.2 |
| transformer-xl-attentive-language-models | 24M | 54.55 | 56.72 |
| fraternal-dropout | 24M | 56.8 | 58.9 |
| learning-associative-inference-using-fast-1 | 24M | 54.48 | 56.76 |
| r-transformer-recurrent-neural-network | - | 84.38 | - |
| efficient-neural-architecture-search-via-1 | 24M | 58.6 | 60.8 |
| an-empirical-evaluation-of-generic | - | 92.48 | - |
| breaking-the-softmax-bottleneck-a-high-rank | 22M | 47.69 | 48.33 |
| 190409408 | 395M | 31.3 | 36.1 |
| autodropout-learning-dropout-patterns-to | - | 54.9 | 58.1 |
| breaking-the-softmax-bottleneck-a-high-rank | 22M | 54.44 | 56.54 |
| deep-residual-output-layers-for-neural | 24M | 49.4 | 49.5 |
| darts-differentiable-architecture-search | 23M | 56.1 | 58.3 |
| a-theoretically-grounded-application-of | - | 75.2 | 77.9 |
| advancing-state-of-the-art-in-language | - | 47.31 | 48.92 |
| language-models-are-few-shot-learners | 175000M | 20.5 | - |
| language-models-are-unsupervised-multitask | 1542M | 35.76 | - |
| deep-residual-output-layers-for-neural | 24M | 55.7 | 58.2 |
| recurrent-neural-network-regularization | - | 82.7 | 86.2 |