Skip-gram Language Modeling Using Sparse Non-negative Matrix Probability Estimation

Noam Shazeer; Joris Pelemans; Ciprian Chelba

Abstract

We present a novel family of language model (LM) estimation techniques named Sparse Non-negative Matrix (SNM) estimation. A first set of experiments empirically evaluating it on the One Billion Word Benchmark shows that SNM $n$-gram LMs perform almost as well as the well-established Kneser-Ney (KN) models. When using skip-gram features the models are able to match the state-of-the-art recurrent neural network (RNN) LMs; combining the two modeling techniques yields the best known result on the benchmark. The computational advantages of SNM over both maximum entropy and RNN LM estimation are probably its main strength, promising an approach that has the same flexibility in combining arbitrary features effectively and yet should scale to very large amounts of data as gracefully as $n$-gram LMs do.
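To make the term "skip-gram features" concrete, here is a minimal sketch of how such features might be extracted from a word history. It is a simplified illustration, not the paper's feature definition: the function name `skip_gram_features` and the parameters `max_span` and `max_skip` are hypothetical. Each feature is a contiguous span of context words together with the number of words skipped between that span and the word being predicted; a skip of 0 gives the usual $n$-gram context.

```python
from typing import List, Tuple

Feature = Tuple[Tuple[str, ...], int]


def skip_gram_features(context: List[str], max_span: int = 2,
                       max_skip: int = 2) -> List[Feature]:
    """Return (span_words, skip) pairs for the given word history.

    span_words is a contiguous run of context words; skip is how many
    words lie between the end of that run and the target position.
    Hypothetical helper for illustration only.
    """
    features: List[Feature] = []
    n = len(context)
    for skip in range(max_skip + 1):
        end = n - skip  # the span ends `skip` words before the target
        for span in range(1, max_span + 1):
            start = end - span
            if start < 0:
                continue  # not enough history for this span/skip pair
            features.append((tuple(context[start:end]), skip))
    return features


if __name__ == "__main__":
    for feature in skip_gram_features(["the", "quick", "brown", "fox"],
                                      max_span=2, max_skip=1):
        print(feature)
```

Features with a skip of 0 coincide with ordinary $n$-gram contexts, so in this sketch the skip-gram set is a superset of the $n$-gram set.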

Benchmarks

Benchmark: language-modelling-on-one-billion-word
Methodology: Sparse Non-Negative
Metrics: Number of params: 33B; PPL: 52.9
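PPL here stands for perplexity, where lower is better. As a reminder of how the metric is conventionally computed, the sketch below takes the probabilities a model assigned to each word of a test text and returns the exponential of the average negative log-probability. The function name `perplexity` and the sample inputs are illustrative assumptions, not values from the paper.

```python
import math
from typing import Sequence


def perplexity(word_probs: Sequence[float]) -> float:
    """Perplexity = exp(-(1/N) * sum(log p_i)) over N predicted words."""
    if not word_probs:
        raise ValueError("need at least one word probability")
    log_sum = sum(math.log(p) for p in word_probs)
    return math.exp(-log_sum / len(word_probs))


if __name__ == "__main__":
    # A model that assigns probability 1/4 to every word is as uncertain
    # as a uniform choice among 4 words.
    print(perplexity([0.25, 0.25, 0.25, 0.25]))
```

Under this definition, a perplexity of 52.9 corresponds to the model being, on average, about as uncertain as a uniform choice among roughly 53 words at each position.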
