Purely sequence-trained neural networks for ASR based on lattice-free MMI

Sanjeev Khudanpur, Xingyu Na, Yiming Wang, Daniel Povey, Vimal Manohar, Vijayaditya Peddinti, Pegah Ghahremani, Daniel Galvez

Abstract

In this paper we describe a method to perform sequence-discriminative training of neural network acoustic models without the need for frame-level cross-entropy pre-training. We use the lattice-free version of the maximum mutual information (MMI) criterion: LF-MMI. To make its computation feasible, we use a phone n-gram language model in place of the word language model. To further reduce its space and time complexity, we compute the objective function using neural network outputs at one third the standard frame rate. These changes enable us to perform the computation for the forward-backward algorithm on GPUs. Further, the reduced output frame rate also provides a significant speed-up during decoding. We present results on 5 different LVCSR tasks with training data ranging from 100 to 2100 hours. Models trained with LF-MMI provide a relative word error rate reduction of ∼11.5% over those trained with the cross-entropy objective function, and ∼8% over those trained with the cross-entropy and sMBR objective functions. A further reduction of ∼2.5% relative can be obtained by fine-tuning these models with the word-lattice-based sMBR objective function.
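
For orientation, the MMI criterion named in the abstract is commonly written as below. This is a sketch in assumed notation, not taken from this page: the symbols O_u (acoustic observations of utterance u), w_u (its reference transcript), M_w (the HMM corresponding to word sequence w), P(w) (the language model probability) and θ (the network parameters) are all introduced here for illustration, and any acoustic scaling factor is omitted.

```latex
\mathcal{F}_{\mathrm{MMI}}(\theta)
  = \sum_{u=1}^{U}
    \log
    \frac{p_{\theta}\!\left(\mathbf{O}_u \mid \mathcal{M}_{w_u}\right) P(w_u)}
         {\sum_{w} p_{\theta}\!\left(\mathbf{O}_u \mid \mathcal{M}_{w}\right) P(w)}
```

In the lattice-free setting described in the abstract, the denominator sum would run over the sequences allowed by the phone n-gram language model and be evaluated with the forward-backward algorithm, rather than over a word lattice.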

Benchmarks

Benchmark: speech-recognition-on-wsj-eval92
Methodology: tdnn + chain
Metrics: Word Error Rate (WER): 2.32
