BAT: Boundary aware transducer for memory-efficient and low-latency ASR

Keyu An, Xian Shi, Shiliang Zhang

Abstract

Recently, the recurrent neural network transducer (RNN-T) has gained increasing popularity due to its natural streaming capability and superior performance. Nevertheless, RNN-T training requires substantial time and computational resources because the RNN-T loss calculation is slow and memory-intensive. Another limitation of RNN-T is that it tends to access more context for better performance, which leads to higher emission latency in streaming ASR. In this paper, we propose the boundary-aware transducer (BAT) for memory-efficient and low-latency ASR. In BAT, the lattice for RNN-T loss computation is reduced to a restricted region selected by the alignment from continuous integrate-and-fire (CIF), which is jointly optimized with the RNN-T model. Extensive experiments demonstrate that, compared to RNN-T, BAT significantly reduces time and memory consumption in training and achieves good CER-latency trade-offs in inference for streaming ASR.
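The lattice restriction described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names `cif_boundaries` and `restricted_mask`, the firing threshold of 1.0, and the symmetric `left`/`right` window around each boundary are all assumptions made for illustration. The sketch accumulates per-frame CIF weights to find the frame at which each token fires, then keeps only the frame-token cells near those firing frames.

```python
def cif_boundaries(alphas, threshold=1.0):
    """Return the frame index at which each token fires.

    Assumes each weight lies in [0, 1], so at most one token fires per frame.
    The threshold value is an illustrative assumption.
    """
    boundaries = []
    acc = 0.0
    for t, a in enumerate(alphas):
        acc += a
        if acc >= threshold:
            boundaries.append(t)
            acc -= threshold  # carry the remainder over to the next token
    return boundaries


def restricted_mask(boundaries, num_frames, left, right):
    """Build mask[t][u]: True if token u may be emitted at frame t.

    Hypothetical windowing rule: token u is allowed only at frames within
    [boundary_u - left, boundary_u + right], clipped to the utterance.
    """
    mask = [[False] * len(boundaries) for _ in range(num_frames)]
    for u, b in enumerate(boundaries):
        start = max(0, b - left)
        stop = min(num_frames - 1, b + right)
        for t in range(start, stop + 1):
            mask[t][u] = True
    return mask


if __name__ == "__main__":
    # Toy CIF weights for an 8-frame utterance (made-up values).
    alphas = [0.25, 0.75, 0.5, 0.5, 0.25, 0.25, 0.5, 0.0]
    bounds = cif_boundaries(alphas)
    mask = restricted_mask(bounds, num_frames=len(alphas), left=1, right=1)
    kept = sum(sum(row) for row in mask)
    total = len(alphas) * len(bounds)
    print(f"boundaries={bounds}, kept {kept} of {total} lattice cells")
```

In this toy setting, the loss would only need to be evaluated on the cells marked True, which is where the memory and time savings would come from; the actual region selection in BAT may differ.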

Benchmarks

Benchmark: speech-recognition-on-aishell-1
Methodology: BAT
Metrics: Params (M): 90; Word Error Rate (WER): 4.97
