DART

Dataset Information

Modalities: Texts
License: Unknown

Overview

DART is a large dataset for open-domain generation of text from structured data records. It consists of 82,191 examples across different domains. Each input is a semantic RDF triple set derived from data records in tables and from the tree ontology of the table schema, and is annotated with sentence descriptions that cover all facts in the triple set.
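
The input/output structure described above can be sketched in Python. This is only an illustration under stated assumptions: the class name `Example`, its field names, the `<S>/<P>/<O>` linearization markers, and the sample triples are all hypothetical and are not taken from the DART release or from any particular model's preprocessing.

```python
from dataclasses import dataclass
from typing import List, Tuple

# One RDF-style fact: (subject, predicate, object).
Triple = Tuple[str, str, str]


@dataclass
class Example:
    """Hypothetical container for one record: a triple set plus its annotations."""
    triples: List[Triple]
    references: List[str]  # human-written sentences describing the triple set


def linearize(triples: List[Triple]) -> str:
    """Flatten a triple set into one string, as a seq2seq model input might be.

    The <S>/<P>/<O> markers are an assumed convention, not a fixed standard.
    """
    return " ".join(f"<S> {s} <P> {p} <O> {o}" for s, p, o in triples)


def uncovered_triples(example: Example) -> List[Triple]:
    """Crude coverage check: return triples whose subject or object string
    appears in none of the reference sentences (case-insensitive match)."""
    refs = [r.lower() for r in example.references]
    missing = []
    for s, p, o in example.triples:
        covered = any(s.lower() in r and o.lower() in r for r in refs)
        if not covered:
            missing.append((s, p, o))
    return missing


# Invented sample for illustration only; not an actual dataset record.
example = Example(
    triples=[
        ("Alan Turing", "BIRTH_PLACE", "London"),
        ("Alan Turing", "FIELD", "computer science"),
    ],
    references=["Alan Turing, born in London, worked in computer science."],
)

print(linearize(example.triples))
print(uncovered_triples(example))
```

The string-matching coverage check is deliberately naive. It only approximates the requirement that the annotation covers all facts in the triple set, and it would miss paraphrased entities.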

Source: DART: Open-Domain Structured Data Record to Text Generation

Variants: DART

Associated Benchmarks

This dataset is used in 3 benchmarks:

Recent Benchmark Submissions

Task | Model | Paper | Date
Data-to-Text Generation | self-mem + new data | Self-training from Self-memory in Data-to-text … | 2024-01-19
Text Generation | Control Prefixes (T5-large) | Control Prefixes for Parameter-Efficient Text … | 2021-10-15
Table-to-Text Generation | HTLM (fine-tuning) | HTLM: Hyper-Text Pre-Training and Prompting … | 2021-07-14
Table-to-Text Generation | GPT-2-Large (fine-tuning) | HTLM: Hyper-Text Pre-Training and Prompting … | 2021-07-14
Text Generation | T5 | The GEM Benchmark: Natural Language … | 2021-02-02
Text Generation | BART | The GEM Benchmark: Natural Language … | 2021-02-02

Research Papers

Recent papers with results on this dataset: