6 months ago

Abstract

Current research in text simplification (TS) has been hampered by two central problems: (i) the small amount of high-quality parallel simplification data available, and (ii) the lack of explicit annotations of simplification operations, such as deletions or substitutions, on existing data. While the recently introduced Newsela corpus has alleviated the first problem, simplifications still need to be learned directly from parallel text using black-box, end-to-end approaches rather than from explicit annotations. These complex-simple parallel sentence pairs often differ to such a high degree that generalization becomes difficult. End-to-end models also make it hard to interpret what is actually learned from data. We propose a method that decomposes the task of TS into its sub-problems. We devise a way to automatically identify operations in a parallel corpus and introduce a sequence-labeling approach based on these annotations. Finally, we provide insights on the types of transformations that different approaches can model.
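To make the idea of token-level operation labels concrete, here is a minimal sketch in Python. It is not the authors' annotation procedure, which the abstract does not spell out. As an assumption for illustration only, it derives labels from a plain token diff using the standard library's `difflib.SequenceMatcher`. The function name `label_operations` and the label set `KEEP`/`DELETE`/`REPLACE` are hypothetical.

```python
from difflib import SequenceMatcher


def label_operations(complex_tokens, simple_tokens):
    """Label each token of the complex sentence with an edit operation.

    Assumption: labels come from a token-level diff, which is a stand-in
    for whatever alignment procedure the paper actually uses.
    """
    labels = [None] * len(complex_tokens)
    matcher = SequenceMatcher(a=complex_tokens, b=simple_tokens, autojunk=False)
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag == "equal":
            op = "KEEP"
        elif tag == "delete":
            op = "DELETE"
        elif tag == "replace":
            op = "REPLACE"
        else:
            # "insert": tokens exist only on the simple side, so there is
            # no complex-side token to label in this simplified scheme.
            continue
        for i in range(i1, i2):
            labels[i] = op
    return list(zip(complex_tokens, labels))


if __name__ == "__main__":
    pairs = label_operations(
        "he quickly left".split(),
        "he left".split(),
    )
    for token, label in pairs:
        print(f"{token}\t{label}")
```

Labels of this kind turn each complex sentence into a tagged sequence, which is the input format a sequence labeler would train on. A pure diff cannot capture reordering or additions tied to specific source tokens, so a real annotation scheme would need a richer alignment.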


Carolina Scarton, Fernando Alva-Manchego, Lucia Specia, Joachim Bingel, Gustavo Paetzold


Learning How to Simplify From Explicit Labeling of Complex-Simplified Text Pairs
