
On the State of the Art of Evaluation in Neural Language Models

Gábor Melis, Chris Dyer, Phil Blunsom

Abstract

Ongoing innovations in recurrent neural network architectures have provided a steady influx of apparently state-of-the-art results on language modelling benchmarks. However, these have been evaluated using differing code bases and limited computational resources, which represent uncontrolled sources of experimental variation. We reevaluate several popular architectures and regularisation methods with large-scale automatic black-box hyperparameter tuning and arrive at the somewhat surprising conclusion that standard LSTM architectures, when properly regularised, outperform more recent models. We establish a new state of the art on the Penn Treebank and Wikitext-2 corpora, as well as strong baselines on the Hutter Prize dataset.
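The abstract's "large-scale automatic black-box hyperparameter tuning" treats each architecture's hyperparameters as inputs to an opaque objective, here validation perplexity. As a rough illustration only, the Python sketch below substitutes plain random search for the paper's Gaussian-process-based black-box optimiser; the search space, trial budget, and the `train_and_evaluate` callable are hypothetical placeholders, not the paper's actual setup.

```python
import random

# Illustrative stand-in for the paper's black-box tuner: random search
# over a small hyperparameter space. `train_and_evaluate` is assumed to
# train a model with the given configuration and return its validation
# perplexity (lower is better).
SEARCH_SPACE = {
    "learning_rate": lambda: 10 ** random.uniform(-4, -2),
    "dropout": lambda: random.uniform(0.0, 0.8),
    "hidden_size": lambda: random.choice([256, 512, 650, 1024]),
}

def tune(train_and_evaluate, num_trials=60):
    """Sample configurations and keep the one with the best validation
    perplexity."""
    best_cfg, best_ppl = None, float("inf")
    for _ in range(num_trials):
        cfg = {name: sample() for name, sample in SEARCH_SPACE.items()}
        ppl = train_and_evaluate(cfg)
        if ppl < best_ppl:
            best_cfg, best_ppl = cfg, ppl
    return best_cfg, best_ppl
```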

Code Repositories

deepmind/lamb (TensorFlow)

Benchmarks

Benchmark: language-modelling-on-wikitext-2
Methodology: Melis et al. (2017) - 1-layer LSTM (tied)
Metrics:
  Number of params: 24M
  Test perplexity: 65.9
  Validation perplexity: 69.3
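For concreteness, here is a minimal PyTorch sketch of a 1-layer LSTM language model with tied input/output embeddings, loosely mirroring the "1-layer LSTM (tied)" row above, together with the perplexity metric it reports. The official implementation (deepmind/lamb) is in TensorFlow; the hidden size, dropout rate, and helper below are illustrative assumptions, not the paper's tuned values.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class TiedLSTMLM(nn.Module):
    """Sketch of a 1-layer LSTM language model with tied embedding and
    softmax weights. Sizes and dropout are placeholders, not the tuned
    hyperparameters behind the benchmark numbers above."""

    def __init__(self, vocab_size, hidden_size=650, dropout=0.5):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.drop = nn.Dropout(dropout)
        self.lstm = nn.LSTM(hidden_size, hidden_size, batch_first=True)
        self.decoder = nn.Linear(hidden_size, vocab_size)
        # Weight tying: the output projection reuses the embedding matrix
        # (requires embedding dim == hidden size, as here).
        self.decoder.weight = self.embed.weight

    def forward(self, tokens, state=None):
        x = self.drop(self.embed(tokens))   # (batch, seq, hidden)
        out, state = self.lstm(x, state)
        return self.decoder(self.drop(out)), state

def perplexity(logits, targets):
    """Perplexity = exp(mean per-token cross-entropy), the metric
    reported in the benchmark table above."""
    nll = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                          targets.reshape(-1))
    return math.exp(nll.item())
```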
