HyperAIHyperAI

Command Palette

Search for a command to run...

3 months ago

Advancing State of the Art in Language Modeling

David Herel Tomas Mikolov

Advancing State of the Art in Language Modeling

Abstract

Generalization is arguably the most important goal of statistical language modeling research. Publicly available benchmarks and papers published with an open-source code have been critical to advancing the field. However, it is often very difficult, and sometimes even impossible, to reproduce the results fully as reported in publications. In this paper, we propose a simple framework that should help advance the state of the art in language modeling in terms of generalization. We propose to publish not just the code, but also probabilities on dev and test sets with future publications so that one can easily add the new model into an ensemble. This has crucial advantages: it is much easier to determine whether a newly proposed model is actually complementary to the current baseline. Therefore, instead of inventing new names for the old tricks, the scientific community can advance faster. Finally, this approach promotes diversity of ideas: one does not need to create an individual model that is the new state of the art to attract attention; it will be sufficient to develop a new model that learns patterns which other models do not. Thus, even a suboptimal model can be found to have value. Remarkably, our approach has yielded new state-of-the-art results across various language modeling benchmarks up to 10%.

Code Repositories

davidherel/sota_lm
Official
pytorch
Mentioned in GitHub

Benchmarks

BenchmarkMethodologyMetrics
language-modelling-on-penn-treebank-wordEnsemble of All
Test perplexity: 47.31
Validation perplexity: 48.92
language-modelling-on-wikitext-103Ensemble of All
Test perplexity: 13.29
Validation perplexity: 13.11
language-modelling-on-wikitext-2Ensemble of All
Test perplexity: 53.73
Validation perplexity: 55.4

Build AI with AI

From idea to launch — accelerate your AI development with free AI co-coding, out-of-the-box environment and best price of GPUs.

AI Co-coding
Ready-to-use GPUs
Best Pricing
Get Started

Hyper Newsletters

Subscribe to our latest updates
We will deliver the latest updates of the week to your inbox at nine o'clock every Monday morning
Powered by MailChimp