Recurrent Neural Network Regularization
Wojciech Zaremba, Ilya Sutskever, Oriol Vinyals

Abstract
We present a simple regularization technique for Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM) units. Dropout, the most successful technique for regularizing neural networks, does not work well with RNNs and LSTMs. In this paper, we show how to correctly apply dropout to LSTMs, and show that it substantially reduces overfitting on a variety of tasks. These tasks include language modeling, speech recognition, image caption generation, and machine translation.
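The core idea is to apply dropout only to the non-recurrent connections of a stacked LSTM (the layer inputs and outputs), leaving the hidden-to-hidden recurrent connections untouched. Below is a minimal PyTorch sketch of this scheme; the default hyperparameters roughly follow the paper's "medium" configuration, but the class name and exact setup are illustrative assumptions, not code from the paper or the official repository.

```python
import torch
import torch.nn as nn

class RegularizedLSTM(nn.Module):
    """Word-level LSTM language model with dropout on non-recurrent
    connections only, in the spirit of Zaremba et al. (2014).
    Defaults loosely mirror the paper's "medium" model (illustrative)."""

    def __init__(self, vocab_size, embed_dim=650, hidden_dim=650,
                 num_layers=2, dropout=0.5):
        super().__init__()
        self.drop = nn.Dropout(dropout)
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # nn.LSTM's `dropout` argument applies between stacked layers only,
        # never to the recurrent (hidden-to-hidden) connections -- exactly
        # the non-recurrent placement the paper prescribes.
        self.lstm = nn.LSTM(embed_dim, hidden_dim, num_layers,
                            dropout=dropout, batch_first=True)
        self.head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens, state=None):
        x = self.drop(self.embed(tokens))   # dropout on the layer input
        out, state = self.lstm(x, state)
        out = self.drop(out)                # dropout before the softmax layer
        return self.head(out), state

# Hypothetical usage: a batch of 20 sequences of 35 word indices.
model = RegularizedLSTM(vocab_size=10000)
logits, _ = model(torch.randint(0, 10000, (20, 35)))
print(logits.shape)  # torch.Size([20, 35, 10000])
```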
Code Repositories
- wojzaremba/lstm (Official)
- simon-benigeri/lstm-language-model (PyTorch)
- rgarzonj/LSTMs (TensorFlow)
- Goodideax/lstm-negtive (PyTorch)
- martin-gorner/tensorflow-rnn-shakespeare (TensorFlow)
- shivam13juna/Sequence_Prediction_LSTM_CHAR (TensorFlow)
- hjc18/language_modeling_lstm (PyTorch)
- nbansal90/bAbi_QA
- Goodideax/rnn_neg_efficient (PyTorch)
- jincan333/lot (PyTorch)
- ahmetumutdurmus/zaremba (PyTorch)
- hikaruya8/lstm_model_py (PyTorch)
- floydhub/word-language-model (PyTorch)
- sebastianGehrmann/tensorflow-statereader (TensorFlow)
- FredericGodin/QuasiRNN-DReLU
- isi-nlp/Zoph_RNN
- dhecloud/simple_language_modelling (PyTorch)
- tmatha/lstm (TensorFlow)
- tomsercu/lstm
Benchmarks
| Benchmark | Model | Metrics |
|---|---|---|
| Language Modelling on Penn Treebank (word level) | Zaremba et al. (2014) - LSTM (large) | Test perplexity: 78.4; Validation perplexity: 82.2 |
| Language Modelling on Penn Treebank (word level) | Zaremba et al. (2014) - LSTM (medium) | Test perplexity: 82.7; Validation perplexity: 86.2 |
| Machine Translation on WMT2014 English-French | Regularized LSTM | BLEU score: 29.03 |
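For reference, the perplexities above are simply the exponential of the average per-token cross-entropy (in nats), so a test perplexity of 78.4 corresponds to a mean loss of ln(78.4) ≈ 4.36. A small sketch with hypothetical random tensors shows the conversion:

```python
import math
import torch
import torch.nn.functional as F

# Random logits/targets stand in for real model outputs (illustrative only).
logits = torch.randn(64, 10000)          # (tokens, vocab)
targets = torch.randint(0, 10000, (64,))
loss = F.cross_entropy(logits, targets)  # mean negative log-likelihood in nats
perplexity = math.exp(loss.item())       # perplexity = exp(cross-entropy)
print(f"perplexity = {perplexity:.1f}")
```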