ERNIE-GEN: An Enhanced Multi-Flow Pre-training and Fine-tuning Framework for Natural Language Generation
Dongling Xiao; Han Zhang; Yukun Li; Yu Sun; Hao Tian; Hua Wu; Haifeng Wang

Abstract
Current pre-training works in natural language generation pay little attention to the problem of exposure bias on downstream tasks. To address this issue, we propose an enhanced multi-flow sequence-to-sequence pre-training and fine-tuning framework named ERNIE-GEN, which bridges the discrepancy between training and inference with an infilling generation mechanism and a noise-aware generation method. To make generation closer to human writing patterns, this framework introduces a span-by-span generation flow that trains the model to predict semantically-complete spans consecutively rather than predicting word by word. Unlike existing pre-training methods, ERNIE-GEN incorporates multi-granularity target sampling to construct pre-training data, which enhances the correlation between encoder and decoder. Experimental results demonstrate that ERNIE-GEN achieves state-of-the-art results with a much smaller amount of pre-training data and parameters on a range of language generation tasks, including abstractive summarization (Gigaword and CNN/DailyMail), question generation (SQuAD), dialogue generation (Persona-Chat) and generative question answering (CoQA).
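The abstract names a noise-aware generation method but does not spell out the mechanics. The sketch below illustrates one plausible reading: during training, some ground-truth target tokens fed to the decoder are swapped for random vocabulary tokens, so the model learns to continue from imperfect history, as it must at inference time. The function name `corrupt_target`, the toy vocabulary, and the noise rate are all illustrative assumptions, not taken from the ERNIE-GEN code.

```python
import random


def corrupt_target(tokens, vocab, noise_rate, rng=None):
    """Return a copy of `tokens` in which each position is replaced,
    with probability `noise_rate`, by a token drawn uniformly from `vocab`.

    Hypothetical helper for illustration only.
    """
    rng = rng or random.Random()
    noisy = []
    for tok in tokens:
        if rng.random() < noise_rate:
            # Simulate a mistake the model might make at inference time.
            noisy.append(rng.choice(vocab))
        else:
            noisy.append(tok)
    return noisy


# Toy data (assumed, not from the paper).
target = ["the", "cat", "sat", "on", "the", "mat"]
vocab = ["the", "cat", "sat", "on", "mat", "dog", "ran"]

# The decoder would be fed `noisy_input`, while the loss would still be
# computed against the clean `target`.
noisy_input = corrupt_target(target, vocab, noise_rate=0.3, rng=random.Random(0))
```

The design point is that only the decoder input is corrupted; the prediction targets stay clean, which is what lets the model learn to recover from earlier errors rather than copy them.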
Benchmarks
| Benchmark | Model | Metrics |
|---|---|---|
| abstractive-text-summarization-on-cnn-daily | ERNIE-GEN LARGE (large-scale text corpora) | ROUGE-1: 44.31; ROUGE-2: 21.35; ROUGE-L: 41.60 |
| abstractive-text-summarization-on-cnn-daily | ERNIE-GEN BASE | ROUGE-1: 42.30; ROUGE-2: 19.92; ROUGE-L: 39.68 |
| abstractive-text-summarization-on-cnn-daily | ERNIE-GEN LARGE | ROUGE-1: 44.02; ROUGE-2: 21.17; ROUGE-L: 41.26 |
| generative-question-answering-on-coqa | ERNIE-GEN | F1-Score: 84.5 |
| question-generation-on-squad11 | ERNIE-GEN LARGE (beam size=5) | BLEU-4: 25.41 |
| text-summarization-on-gigaword | ERNIE-GEN LARGE (large-scale text corpora) | ROUGE-1: 39.46; ROUGE-2: 20.34; ROUGE-L: 36.74 |
| text-summarization-on-gigaword | ERNIE-GEN BASE | ROUGE-1: 38.83; ROUGE-2: 20.04; ROUGE-L: 36.20 |
| text-summarization-on-gigaword | ERNIE-GEN LARGE | ROUGE-1: 39.25; ROUGE-2: 20.25; ROUGE-L: 36.53 |
| text-summarization-on-gigaword-10k | ERNIE-GEN LARGE | ROUGE-1: 35.05; ROUGE-2: 16.10; ROUGE-L: 32.50 |
| text-summarization-on-gigaword-10k | ERNIE-GEN BASE | ROUGE-1: 33.75; ROUGE-2: 15.23; ROUGE-L: 31.35 |
| text-summarization-on-gigaword-10k | ERNIE-GEN LARGE (large-scale text corpora) | ROUGE-1: 35.51; ROUGE-2: 16.79; ROUGE-L: 33.23 |