Reaching Human-level Performance in Automatic Grammatical Error Correction: An Empirical Study

Tao Ge; Furu Wei; Ming Zhou

Abstract

Neural sequence-to-sequence (seq2seq) approaches have proven successful in grammatical error correction (GEC). Based on the seq2seq framework, we propose a novel fluency boost learning and inference mechanism. Fluency boost learning generates diverse error-corrected sentence pairs during training, enabling the error correction model to learn how to improve a sentence's fluency from more instances, while fluency boost inference allows the model to correct a sentence incrementally over multiple inference steps. Combining fluency boost learning and inference with convolutional seq2seq models, our approach achieves state-of-the-art performance: 75.72 (F_{0.5}) on the CoNLL-2014 10-annotation dataset and 62.42 (GLEU) on the JFLEG test set, becoming the first GEC system to reach human-level performance (72.58 for CoNLL and 62.37 for JFLEG) on both benchmarks.
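The incremental correction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `fluency_boost_inference`, the stopping rule (stop as soon as a round no longer raises the fluency score), the `max_rounds` limit, and the toy corrector and fluency scorer are all assumptions made for illustration. In a real system, `correct` would be a trained seq2seq model and `fluency` a language-model-based score.

```python
from typing import Callable


def fluency_boost_inference(
    sentence: str,
    correct: Callable[[str], str],
    fluency: Callable[[str], float],
    max_rounds: int = 5,
) -> str:
    """Apply `correct` repeatedly while the fluency score keeps improving."""
    best, best_score = sentence, fluency(sentence)
    for _ in range(max_rounds):
        candidate = correct(best)
        score = fluency(candidate)
        if score <= best_score:  # no further fluency gain: stop editing
            break
        best, best_score = candidate, score
    return best


# Hypothetical stand-ins for a trained corrector and a fluency scorer.
TOY_FIXES = {"goed": "went", "informations": "information"}


def toy_correct(sentence: str) -> str:
    """Fix at most one known token error per call (one edit per round)."""
    tokens = sentence.split()
    for i, tok in enumerate(tokens):
        if tok in TOY_FIXES:
            tokens[i] = TOY_FIXES[tok]
            break
    return " ".join(tokens)


def toy_fluency(sentence: str) -> float:
    """Score in (0, 1]; higher means fewer known errors remain."""
    errors = sum(tok in TOY_FIXES for tok in sentence.split())
    return 1.0 / (1.0 + errors)


result = fluency_boost_inference(
    "He goed there for informations .", toy_correct, toy_fluency
)
print(result)
```

Because the toy corrector fixes only one error per call, a single pass leaves the sentence partly corrected, while the multi-round loop keeps going until the score stops rising. That is the behaviour the abstract attributes to fluency boost inference.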

Code Repositories

getao/human-performance-gec

Official

Benchmarks

Benchmark	Methodology	Metrics
grammatical-error-correction-on-unrestricted	CNN Seq2Seq + Fluency Boost and inference	GLEU: 62.37
grammatical-error-correction-on-unrestricted	CNN Seq2Seq + Fluency Boost	F0.5: 61.34
