ToTTo is an open-domain English table-to-text dataset with over 120,000 training examples. It proposes a controlled generation task: given a Wikipedia table and a set of highlighted table cells, produce a one-sentence description.
During the dataset creation process, tables from English Wikipedia are matched with (noisy) descriptions. Each table cell mentioned in the description is highlighted and the descriptions are iteratively cleaned and corrected to faithfully reflect the content of the highlighted cells.
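To make the task format concrete, here is a minimal sketch of how one example (table, highlighted cells, target sentence) might be represented and flattened into a text input for a sequence-to-sequence model. The `TableExample` class, its field names, the `linearize_highlighted` helper, and the `<cell>`-style tags are all hypothetical choices made for illustration, not the dataset's official schema or preprocessing, and the toy table is invented.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TableExample:
    """Hypothetical container for one controlled table-to-text example."""
    page_title: str
    table: List[List[str]]                    # row 0 is treated as the header row
    highlighted_cells: List[Tuple[int, int]]  # (row, column) indices into `table`
    target: str                               # one-sentence description


def linearize_highlighted(example: TableExample) -> str:
    """Flatten only the highlighted cells, each paired with its column header."""
    header = example.table[0]
    parts = [f"<page_title> {example.page_title} </page_title>"]
    for row, col in example.highlighted_cells:
        value = example.table[row][col]
        parts.append(
            f"<cell> {value} <col_header> {header[col]} </col_header> </cell>"
        )
    return " ".join(parts)


# Invented toy data, used only to show the shape of an example.
toy = TableExample(
    page_title="Example Runner",
    table=[
        ["Year", "Competition", "Position"],
        ["2001", "Toy Championships", "1st"],
        ["2002", "Toy Games", "3rd"],
    ],
    highlighted_cells=[(1, 0), (1, 1), (1, 2)],
    target="Example Runner finished 1st at the 2001 Toy Championships.",
)

source_text = linearize_highlighted(toy)
print(source_text)
```

Because only the highlighted cells are passed on, the second table row never reaches the model input. This is the sense in which the generation is controlled: the target sentence is expected to describe the highlighted content and nothing else.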
Source: Google Research Datasets
Variants: ToTTo
This dataset is used in 1 benchmark:
Task | Model | Paper | Date |
---|---|---|---|
Data-to-Text Generation | LATTICE (T5-base) | Robust (Controlled) Table-to-Text Generation with … | 2022-05-08 |
Data-to-Text Generation | T5 | The GEM Benchmark: Natural Language … | 2021-02-02 |
Data-to-Text Generation | T5-3B | Text-to-Text Pre-Training for Data-to-Text Tasks | 2020-05-21 |
Data-to-Text Generation | BERT-to-BERT | ToTTo: A Controlled Table-To-Text Generation … | 2020-04-29 |
Data-to-Text Generation | Pointer Generator | ToTTo: A Controlled Table-To-Text Generation … | 2020-04-29 |
Data-to-Text Generation | NCP+CC (Puduppully et al., 2019) | ToTTo: A Controlled Table-To-Text Generation … | 2020-04-29 |
Recent papers with results on this dataset: