KaggleDBQA: Realistic Text-to-SQL dataset
KaggleDBQA is a challenging cross-domain evaluation dataset for text-to-SQL. It is built from real Web databases that keep their domain-specific data types and original formatting, and it pairs them with unrestricted questions.
It expands upon contemporary cross-domain text-to-SQL datasets in three key aspects:
(1) Its databases are pulled from real-world data sources and not normalized.
(2) Its questions are authored in environments that mimic natural question answering.
(3) It also provides database documentation that contains rich in-domain knowledge.
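To make the three aspects above concrete, here is a minimal sketch of what one evaluation example could look like. Everything in it is a hypothetical illustration, not content from KaggleDBQA: the `crime` table, its abbreviated column names, the `column_docs` dictionary, the question, and both SQL strings are invented. The `execution_match` helper compares query results, which is one common way to score text-to-SQL output; it is an assumption here, not the metric the benchmark is claimed to use.

```python
import sqlite3

# Hypothetical, non-normalized table with terse column names, in the spirit of
# a raw real-world export. None of this data comes from KaggleDBQA.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE crime (rpt_dt TEXT, off_type TEXT, dist_nm TEXT)")
conn.executemany(
    "INSERT INTO crime VALUES (?, ?, ?)",
    [
        ("2019-01-03", "THEFT", "North"),
        ("2019-01-04", "THEFT", "South"),
        ("2019-01-04", "ASSAULT", "North"),
    ],
)

# Hypothetical database documentation: in-domain knowledge that helps map
# words in a question onto abbreviated column names.
column_docs = {
    "rpt_dt": "date the incident was reported",
    "off_type": "offense type",
    "dist_nm": "police district name",
}


def execution_match(conn, predicted_sql, gold_sql):
    """Return True if both queries return the same rows, ignoring row order."""
    predicted_rows = sorted(conn.execute(predicted_sql).fetchall())
    gold_rows = sorted(conn.execute(gold_sql).fetchall())
    return predicted_rows == gold_rows


# A question phrased naturally, without reference to the column names.
question = "How many thefts were reported in the North district?"
gold_sql = (
    "SELECT COUNT(*) FROM crime "
    "WHERE off_type = 'THEFT' AND dist_nm = 'North'"
)
# Stand-in for a model prediction (written by hand for this sketch).
predicted_sql = (
    "SELECT COUNT(*) FROM crime "
    "WHERE dist_nm = 'North' AND off_type = 'THEFT'"
)

match = execution_match(conn, predicted_sql, gold_sql)
print(question, "->", "match" if match else "mismatch")
```

The question says "thefts" and "district" while the schema says `off_type` and `dist_nm`. Documentation such as `column_docs` is meant to help close that kind of gap.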
This dataset is used in 1 benchmark:
Task | Model | Paper | Date |
---|---|---|---|
Text-To-SQL | RAT-SQL | KaggleDBQA: Realistic Evaluation of Text-to-SQL … | 2021-06-22 |
Text-To-SQL | Edit-SQL | KaggleDBQA: Realistic Evaluation of Text-to-SQL … | 2021-06-22 |