A Low-Resource Approach to the Grammatical Error Correction of Ukrainian

Frank Palma Gomez, Alla Rozovskaya, and Dan Roth

Abstract

We present our system that participated in the shared task on the grammatical error correction (GEC) of Ukrainian. We implemented two approaches that make use of large pre-trained language models and synthetic data, which have been used for error correction of English as well as low-resource languages. The first approach fine-tunes a large multilingual language model (mT5) in two stages: first on synthetic data, and then on gold data. The second approach trains a (smaller) seq2seq Transformer model that is pre-trained on synthetic data and fine-tuned on gold data. Our mT5-based model scored first in the “GEC only” track and a very close second in the “GEC+Fluency” track. Our two key innovations are (1) fine-tuning in stages, first on synthetic and then on gold data; and (2) a high-quality corruption method based on round-trip machine translation to complement existing noisification approaches.
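The two innovations named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not describe the pipeline in detail, so every name below (`roundtrip_pairs`, `finetune_in_stages`, `to_pivot`, `from_pivot`, `train_step`) is hypothetical, the translation functions are toy stand-ins for a real machine translation system, and dropping sentences that the round trip leaves unchanged is an assumption made here for illustration.

```python
from typing import Callable, Iterable, List, Tuple

# A training pair: (noisy source sentence, clean target sentence).
Pair = Tuple[str, str]


def roundtrip_pairs(
    clean_sentences: Iterable[str],
    to_pivot: Callable[[str], str],
    from_pivot: Callable[[str], str],
) -> List[Pair]:
    """Corrupt clean sentences by translating to a pivot language and back.

    Both translation callables are hypothetical placeholders for an MT system.
    """
    pairs: List[Pair] = []
    for clean in clean_sentences:
        noisy = from_pivot(to_pivot(clean))
        # Assumption: a pair is only useful if the round trip changed the text.
        if noisy != clean:
            pairs.append((noisy, clean))
    return pairs


def finetune_in_stages(
    train_step: Callable[[Pair], None],
    synthetic: List[Pair],
    gold: List[Pair],
    synthetic_epochs: int = 1,
    gold_epochs: int = 1,
) -> None:
    """Run all synthetic-data updates first, then all gold-data updates."""
    for _ in range(synthetic_epochs):
        for pair in synthetic:
            train_step(pair)
    for _ in range(gold_epochs):
        for pair in gold:
            train_step(pair)


# Toy stand-ins: the "translation" lowercases the text, which acts as noise.
def toy_to_pivot(sentence: str) -> str:
    return sentence.lower()


def toy_from_pivot(sentence: str) -> str:
    return sentence


seen: List[Pair] = []  # records the order in which pairs reach the "model"
synthetic = roundtrip_pairs(["Hello world", "ok"], toy_to_pivot, toy_from_pivot)
gold = [("gold src", "gold tgt")]
finetune_in_stages(seen.append, synthetic, gold)
```

In this toy run the sentence "ok" survives the round trip unchanged and is dropped, so `seen` holds one synthetic pair followed by the gold pair. In a real setup `train_step` would wrap a gradient update on a seq2seq model such as mT5.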

Benchmarks

Benchmark | Methodology | Metrics
grammatical-error-correction-on-ua-gec | mT5 large + 10M synth | F0.5: 68.09
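For readers unfamiliar with the metric, F0.5 is the F-beta score with beta = 0.5, which weights precision more heavily than recall. The sketch below uses the standard F-beta definition; the function name `f_beta` is chosen here for illustration, and the page reports only the final score, so any precision and recall values passed in are made up rather than taken from the paper.

```python
def f_beta(precision: float, recall: float, beta: float = 0.5) -> float:
    """Standard F-beta score: (1 + beta^2) * P * R / (beta^2 * P + R)."""
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    beta_sq = beta * beta
    denominator = beta_sq * precision + recall
    if denominator == 0:
        # Both precision and recall are zero; define the score as zero.
        return 0.0
    return (1 + beta_sq) * precision * recall / denominator


# Illustrative values only: swapping precision and recall changes the score,
# because beta = 0.5 rewards precision more than recall.
high_precision = f_beta(0.8, 0.4)
high_recall = f_beta(0.4, 0.8)
```

With these illustrative inputs, the high-precision system scores higher than the high-recall one, which is the property that makes F0.5 a common choice for error correction, where a wrong edit is usually considered worse than a missed one.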
