UniverSLU: Universal Spoken Language Understanding for Diverse Tasks with Natural Language Instructions

Siddhant Arora Hayato Futami Jee-weon Jung Yifan Peng Roshan Sharma Yosuke Kashiwagi Emiru Tsunoo Karen Livescu Shinji Watanabe

Abstract

Recent studies leverage large language models with multi-tasking capabilities, using natural language prompts to guide the model's behavior and surpassing the performance of task-specific models. Motivated by this, we ask: can we build a single model that jointly performs various spoken language understanding (SLU) tasks? We start by adapting a pre-trained automatic speech recognition model to additional tasks using single-token task specifiers. We then enhance this approach through instruction tuning, i.e., finetuning with a natural language instruction that describes the task, followed by the list of label options. At inference time, our approach generalizes to new task descriptions for seen tasks, which makes it more user-friendly. We demonstrate the efficacy of our single multi-task learning model, "UniverSLU", on 12 speech classification and sequence generation task types spanning 17 datasets and 9 languages. On most tasks, UniverSLU achieves competitive performance and often even surpasses task-specific models. Additionally, we assess its zero-shot capabilities and find that the model generalizes to new datasets and languages for seen task types.
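
To make the two prompting styles in the abstract concrete, here is a minimal Python sketch. It contrasts a single-token task specifier with an instruction that describes the task and then lists the label options. The function names, the `<...>` token format, the "Options:" separator, and the example labels are all illustrative assumptions, not the format used by the authors.

```python
from typing import Sequence


def build_task_specifier(task_name: str) -> str:
    """Return a single-token style task specifier.

    The angle-bracket format is a hypothetical convention for this sketch.
    """
    return "<" + task_name.strip().lower().replace(" ", "_") + ">"


def build_instruction_prompt(description: str, label_options: Sequence[str]) -> str:
    """Return a natural language instruction followed by the label options.

    The "Options:" separator is an assumption made for illustration.
    """
    options = ", ".join(label_options)
    return f"{description.strip()} Options: {options}"


if __name__ == "__main__":
    # Hypothetical intent labels, chosen only for the example.
    labels = ["activate_lights", "change_language"]
    print(build_task_specifier("intent classification"))
    print(build_instruction_prompt("Classify the intent of the utterance.", labels))
```

With the instruction style, a reworded description (for example, "Which intent does the speaker express?") can be passed in without changing the label list, which is the kind of variation in task descriptions the abstract refers to.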

Benchmarks

Benchmark | Methodology | Metrics
spoken-language-understanding-on-fluent | UniverSLU | Accuracy (%): 99.8
