GOLD: Generalized Knowledge Distillation via Out-of-Distribution-Guided Language Data Generation
Mohsen Gholami, Mohammad Akbari, Cindy Hu, Vaden Masrani, Z. Jane Wang, Yong Zhang

Abstract
Knowledge distillation from LLMs is essential for the efficient deployment of language models. Prior works have proposed data generation using LLMs for preparing distilled models. We argue that generating data with LLMs is prone to sampling mainly from the center of the original content distribution. This limitation hinders the distilled model from learning the true underlying data distribution and causes it to forget the tails of the distribution (samples with lower probability). To this end, we propose GOLD, a task-agnostic data generation and knowledge distillation framework that employs an iterative out-of-distribution-guided feedback mechanism for the LLM. As a result, the generated data improves the generalizability of distilled models. An energy-based OOD evaluation approach is also introduced to deal with noisy generated data. Our extensive experiments on 10 different classification and sequence-to-sequence tasks in NLP show that GOLD outperforms prior methods and the LLM with average improvements of 5% and 14%, respectively. We also show that the proposed method is applicable to less-explored and novel tasks. The code is available.
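The abstract does not spell out the energy-based OOD evaluation. As context, a common formulation in the OOD-detection literature derives a per-sample energy from classifier logits, E(x) = -T · logsumexp(f(x)/T), and treats high-energy samples as likely out-of-distribution. The sketch below illustrates that standard score only; the `select_ood_samples` helper and its threshold are hypothetical, and GOLD's actual scoring and filtering rules may differ.

```python
import torch

def energy_score(logits: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
    # Standard energy score: E(x) = -T * logsumexp(f(x) / T).
    # Lower energy -> more in-distribution; higher energy -> likely OOD.
    return -temperature * torch.logsumexp(logits / temperature, dim=-1)

def select_ood_samples(logits: torch.Tensor, threshold: float) -> torch.Tensor:
    # Hypothetical helper: flag generated samples whose energy exceeds a
    # threshold, e.g. to filter noisy generations or to surface tail
    # examples for the iterative LLM feedback loop the abstract describes.
    return energy_score(logits) > threshold
```

In an iterative loop like the one the abstract describes, such a score could separate informative tail samples from noisy generations, but the exact criterion is the paper's contribution and is not reproduced here.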
Benchmarks
| Benchmark | Method | Metric |
|---|---|---|
| Data-Free Knowledge Distillation on QNLI | GOLD (T5-base) | Accuracy: 91.7 |
| Data-Free Knowledge Distillation on SQuAD | GOLD (T5-base) | Exact Match: 75.2 |