Zephyr: Direct Distillation of LM Alignment

Abstract

We aim to produce a smaller language model that is aligned to user intent. Previous research has shown that applying distilled supervised fine-tuning (dSFT) on larger models significantly improves task accuracy; however, these models are unaligned, i.e. they do not respond well to natural prompts. To distill this property, we experiment with the use of preference data from AI Feedback (AIF). Starting from a dataset of outputs ranked by a teacher model, we apply distilled direct preference optimization (dDPO) to learn a chat model with significantly improved intent alignment. The approach requires only a few hours of training without any additional sampling during fine-tuning. The final result, Zephyr-7B, sets the state of the art on chat benchmarks for 7B-parameter models, and requires no human annotation. In particular, results on MT-Bench show that Zephyr-7B surpasses Llama2-Chat-70B, the best open-access RLHF-based model. Code, models, data, and tutorials for the system are available at https://github.com/huggingface/alignment-handbook.
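The dDPO step optimizes the standard DPO objective, with the chosen/rejected response pairs ranked by a teacher model (AI feedback) rather than by human annotators. Below is a minimal PyTorch sketch of that loss, assuming the per-sequence log-probabilities of each response under the policy and under the frozen reference (dSFT) model have already been computed; the function and variable names are illustrative, not taken from the paper or repository.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Direct preference optimization loss over a batch of preference pairs.

    Each argument is a 1-D tensor of summed per-token log-probabilities for
    (prompt, response) pairs; `beta` controls how far the policy may drift
    from the reference (dSFT) model.
    """
    # Implicit reward of each response: beta * log(pi(y|x) / pi_ref(y|x)).
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the margin between chosen and rejected implicit rewards.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy usage with random log-probabilities for a batch of 4 pairs.
torch.manual_seed(0)
p_c, p_r, r_c, r_r = (torch.randn(4) - 5 for _ in range(4))
print(dpo_loss(p_c, p_r, r_c, r_r).item())
```

In the released alignment-handbook code this training step is handled by TRL's DPOTrainer; the function above only restates the core objective for exposition.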

Code Repositories

Savannah120/alignment-handbook-PoFT (PyTorch)
huggingface/alignment-handbook (PyTorch)

Benchmarks

Benchmark                             Methodology                    Metrics
few-shot-learning-on-medconceptsqa   HuggingFaceH4/zephyr-7b-beta   Accuracy: 25.058
zero-shot-learning-on-medconceptsqa  HuggingFaceH4/zephyr-7b-beta   Accuracy: 25.538
