
GOLD: Generalized Knowledge Distillation via Out-of-Distribution-Guided Language Data Generation

Mohsen Gholami; Mohammad Akbari; Cindy Hu; Vaden Masrani; Z. Jane Wang; Yong Zhang


Abstract

Knowledge distillation from LLMs is essential for the efficient deployment of language models. Prior works have proposed data generation using LLMs for preparing distilled models. We argue that generating data with LLMs is prone to sampling mainly from the center of the original content distribution. This limitation hinders the distilled model from learning the true underlying data distribution and causes it to forget the tails of the distribution (samples with lower probability). To this end, we propose GOLD, a task-agnostic data generation and knowledge distillation framework that employs an iterative out-of-distribution-guided feedback mechanism for the LLM. As a result, the generated data improves the generalizability of distilled models. An energy-based OOD evaluation approach is also introduced to deal with noisy generated data. Our extensive experiments on 10 different classification and sequence-to-sequence tasks in NLP show that GOLD outperforms prior arts and the LLM with average improvements of 5% and 14%, respectively. We also show that the proposed method is applicable to less explored and novel tasks. The code is available.
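The abstract mentions an energy-based OOD evaluation for filtering noisy generated data. As a rough illustration only, a standard energy score (in the style of Liu et al., 2020) computes `E(x) = -T * logsumexp(logits / T)` from a classifier's logits; lower energy indicates more in-distribution samples, while high-energy samples can be flagged. The thresholding function below is hypothetical, and the paper's exact criterion may differ:

```python
import math

def energy_score(logits, temperature=1.0):
    # Energy score: E(x) = -T * logsumexp(logits / T).
    # Lower energy -> more in-distribution; higher -> more OOD-like.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    lse = m + math.log(sum(math.exp(z - m) for z in scaled))
    return -temperature * lse

def flag_ood(batch_logits, threshold):
    # Hypothetical filter: mark samples whose energy exceeds a
    # threshold as OOD candidates (GOLD's precise rule is in the paper).
    return [energy_score(logits) > threshold for logits in batch_logits]
```

For example, a confidently classified sample with logits `[10, 0, 0]` gets energy near -10, while a maximally uncertain sample with logits `[0, 0, 0]` gets energy around -1.1, so the uncertain sample is the one flagged under a threshold of -2.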

Benchmarks

Benchmark                                    Methodology      Metric
data-free-knowledge-distillation-on-qnli     GOLD (T5-base)   Accuracy: 91.7
data-free-knowledge-distillation-on-squad    GOLD (T5-base)   Exact Match: 75.2
