ToTTo: A Controlled Table-To-Text Generation Dataset

Ankur P. Parikh, Xuezhi Wang, Sebastian Gehrmann, Manaal Faruqui, Bhuwan Dhingra, Diyi Yang, Dipanjan Das

Abstract

We present ToTTo, an open-domain English table-to-text dataset with over 120,000 training examples that proposes a controlled generation task: given a Wikipedia table and a set of highlighted table cells, produce a one-sentence description. To obtain generated targets that are natural but also faithful to the source table, we introduce a dataset construction process where annotators directly revise existing candidate sentences from Wikipedia. We present systematic analyses of our dataset and annotation process as well as results achieved by several state-of-the-art baselines. While usually fluent, existing methods often hallucinate phrases that are not supported by the table, suggesting that this dataset can serve as a useful research benchmark for high-precision conditional text generation.
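The task input described above (a table plus a set of highlighted cells) can be pictured with a small sketch. The class name, field names, and the toy table below are hypothetical and chosen only for illustration; they are not the dataset's official schema, and the flattening shown is one simple way to turn highlighted cells into a text input, not necessarily what the baselines use.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TableExample:
    """Hypothetical container for one controlled table-to-text example."""
    page_title: str
    table: List[List[str]]                    # row 0 is assumed to be the header
    highlighted_cells: List[Tuple[int, int]]  # (row, column) indices into table
    target: str                               # one-sentence description


def linearize_highlighted(example: TableExample) -> str:
    """Flatten only the highlighted cells into a 'header: value' string."""
    header = example.table[0]
    parts = [f"page: {example.page_title}"]
    for row, col in example.highlighted_cells:
        parts.append(f"{header[col]}: {example.table[row][col]}")
    return " | ".join(parts)


# Toy data, invented for illustration only.
example = TableExample(
    page_title="Example Marathon",
    table=[
        ["Year", "Winner", "Time"],
        ["2018", "A. Runner", "2:10:05"],
        ["2019", "B. Runner", "2:08:41"],
    ],
    highlighted_cells=[(2, 0), (2, 1), (2, 2)],
    target="B. Runner won the 2019 Example Marathon in 2:08:41.",
)

source = linearize_highlighted(example)
print(source)
```

A generation model would take a string like `source` as input and be trained to produce `target`; restricting the input to the highlighted cells is what makes the task controlled.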

Code Repositories

google-research-datasets/ToTTo (official)

Benchmarks

Benchmark | Methodology | Metrics
data-to-text-generation-on-totto | NCP+CC (Puduppully et al., 2019) | BLEU: 19.2, PARENT: 29.2
data-to-text-generation-on-totto | BERT-to-BERT | BLEU: 44, PARENT: 52.6
data-to-text-generation-on-totto | Pointer Generator | BLEU: 41.6, PARENT: 51.6
