ToTTo

Name: ToTTo
Published: 2020-04-29
License: Unknown

Dataset Information

Modalities: Texts
Languages: English
Introduced: 2020
License: Unknown
Homepage: Official Website

Contents

Overview
Associated Benchmarks
Recent Benchmark Submissions
Research Papers

Overview

ToTTo is an open-domain English table-to-text dataset with over 120,000 training examples. It proposes a controlled generation task: given a Wikipedia table and a set of highlighted table cells, produce a one-sentence description.

During dataset creation, tables from English Wikipedia are matched with (noisy) descriptions. Each table cell mentioned in a description is highlighted, and the descriptions are iteratively cleaned and corrected so that they faithfully reflect the content of the highlighted cells.
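To make the controlled generation input concrete, the following is a minimal sketch of how one example might be represented and flattened into a text prompt for a sequence-to-sequence model. The field names (`table`, `highlighted_cells`, `table_page_title`, `table_section_title`) are assumptions modeled loosely on the released data and should be checked against the official files. The table contents are invented for illustration, the function `linearize_highlighted` is hypothetical, and the sketch assumes the first table row is the header row.

```python
from typing import Any, Dict, List

# Invented example record; field names are assumptions, not a verified schema.
example: Dict[str, Any] = {
    "table_page_title": "Example Athlete",
    "table_section_title": "Career statistics",
    "table": [
        [{"value": "Year"}, {"value": "Team"}, {"value": "Goals"}],
        [{"value": "2018"}, {"value": "Example FC"}, {"value": "12"}],
        [{"value": "2019"}, {"value": "Sample United"}, {"value": "7"}],
    ],
    # Each entry is a [row_index, column_index] pair into "table".
    "highlighted_cells": [[1, 0], [1, 2]],
}


def linearize_highlighted(ex: Dict[str, Any]) -> str:
    """Flatten page title, section title, and highlighted cells into one string.

    Assumes row 0 holds the column headers (a simplification; real tables
    may have multi-row or spanning headers).
    """
    table: List[List[Dict[str, str]]] = ex["table"]
    headers = [cell["value"] for cell in table[0]]
    parts = [
        f"page: {ex['table_page_title']}",
        f"section: {ex['table_section_title']}",
    ]
    for row, col in ex["highlighted_cells"]:
        parts.append(f"{headers[col]}: {table[row][col]['value']}")
    return " | ".join(parts)


if __name__ == "__main__":
    print(linearize_highlighted(example))
    # page: Example Athlete | section: Career statistics | Year: 2018 | Goals: 12
```

A model trained on the task would take a string like this as input and be expected to produce a single sentence covering only the highlighted cells, for instance one stating the number of goals scored in 2018. Keeping only the highlighted cells in the prompt is one possible design choice; other linearizations include the full table with markers around highlighted cells.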

Source: Google Research Datasets

Variants: ToTTo

Associated Benchmarks

This dataset is used in 1 benchmark:

Data-to-Text Generation - Metrics: BLEU, PARENT, METEOR

Recent Benchmark Submissions

Task	Model	Paper	Date
Data-to-Text Generation	LATTICE (T5-base)	Robust (Controlled) Table-to-Text Generation with …	2022-05-08
Data-to-Text Generation	T5	The GEM Benchmark: Natural Language …	2021-02-02
Data-to-Text Generation	T5-3B	Text-to-Text Pre-Training for Data-to-Text Tasks	2020-05-21
Data-to-Text Generation	BERT-to-BERT	ToTTo: A Controlled Table-To-Text Generation …	2020-04-29
Data-to-Text Generation	Pointer Generator	ToTTo: A Controlled Table-To-Text Generation …	2020-04-29
Data-to-Text Generation	NCP+CC (Puduppully et al 2019)	ToTTo: A Controlled Table-To-Text Generation …	2020-04-29

Research Papers

Recent papers with results on this dataset:

External Links:

ToTTo
