
ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training

Weizhen Qi Yu Yan Yeyun Gong Dayiheng Liu Nan Duan Jiusheng Chen Ruofei Zhang Ming Zhou

Abstract

This paper presents a new sequence-to-sequence pre-training model called ProphetNet, which introduces a novel self-supervised objective named future n-gram prediction and the proposed n-stream self-attention mechanism. Instead of optimizing one-step-ahead prediction as in the traditional sequence-to-sequence model, ProphetNet is optimized by n-step-ahead prediction, which predicts the next n tokens simultaneously based on previous context tokens at each time step. The future n-gram prediction explicitly encourages the model to plan for future tokens and prevents overfitting on strong local correlations. We pre-train ProphetNet using a base-scale dataset (16GB) and a large-scale dataset (160GB), respectively. We then conduct experiments on the CNN/DailyMail, Gigaword, and SQuAD 1.1 benchmarks for abstractive summarization and question generation tasks. Experimental results show that ProphetNet achieves new state-of-the-art results on all these datasets compared to models using the same scale of pre-training corpus.
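The future n-gram objective described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the per-stream weights `alphas`, and the exact indexing (at step t the decoder sees only the tokens before t and is asked for tokens t through t+n-1) are assumptions made for illustration, and the n-stream self-attention mechanism that produces the predictions is not modeled at all.

```python
import math
from typing import List, Optional, Sequence


def future_ngram_targets(tokens: Sequence[str], n: int) -> List[List[Optional[str]]]:
    """Row t lists the n tokens to predict at step t, given only tokens[:t]:
    tokens[t], tokens[t+1], ..., tokens[t+n-1].
    Slots that run past the end of the sequence are None and carry no loss."""
    if n < 1:
        raise ValueError("n must be at least 1")
    length = len(tokens)
    return [
        [tokens[t + j] if t + j < length else None for j in range(n)]
        for t in range(length)
    ]


def future_ngram_loss(
    target_probs: Sequence[Sequence[Optional[float]]],
    alphas: Sequence[float],
) -> float:
    """Weighted negative log-likelihood summed over all prediction streams.

    target_probs[t][j] is the probability a (hypothetical) model assigned to
    the correct token of stream j at step t, or None for a padded slot.
    alphas[j] is an assumed weight for stream j."""
    loss = 0.0
    for row in target_probs:
        for j, prob in enumerate(row):
            if prob is not None:
                loss -= alphas[j] * math.log(prob)
    return loss


# With n = 2, every step is asked for the next token and the one after it.
targets = future_ngram_targets(["the", "cat", "sat", "down"], n=2)
for step, row in enumerate(targets):
    print(step, row)
```

With n = 1 the targets reduce to the ordinary one-step-ahead setup, which is the case the abstract contrasts against.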

Code Repositories

Repository	Framework	Notes
d294270681/ProphetNet-paddle	paddle	
microsoft/ProphetNet	pytorch	Official, Mentioned in GitHub
huggingface/transformers	pytorch	Mentioned in GitHub
microsoft/ar2	pytorch	Mentioned in GitHub
MS-P3/code7/tree/main/xlm_prophetnet	mindspore	

Benchmarks

Benchmark	Model	Metrics
Abstractive text summarization on CNN/DailyMail	ProphetNet	ROUGE-1: 44.20, ROUGE-2: 21.17, ROUGE-L: 41.30
Question generation on SQuAD 1.1	ProphetNet	BLEU-4: 23.91, METEOR: 26.6, ROUGE-L: 52.3
Text summarization on Gigaword	ProphetNet	ROUGE-1: 39.51, ROUGE-2: 20.42, ROUGE-L: 36.69
