HotpotQA

Dataset Information
Modalities: Texts
Languages: English
Introduced: 2018

Overview

HotpotQA is a question answering dataset collected on the English Wikipedia, containing about 113K crowd-sourced questions that are constructed to require the introduction paragraphs of two Wikipedia articles to answer. Each question in the dataset comes with the two gold paragraphs, as well as a list of sentences in these paragraphs that crowdworkers identify as supporting facts necessary to answer the question.
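The record layout described above can be sketched as follows. This is a minimal illustration, not the dataset's documented schema: the field names (`question`, `answer`, `context`, `supporting_facts`), the helper `supporting_sentences`, and the toy record itself are assumptions made up for the example.

```python
from typing import Dict, List, Tuple

# Toy record with placeholder content. The field names are assumptions:
# "context" holds [title, sentences] pairs, and "supporting_facts" holds
# [title, sentence_index] pairs pointing into those paragraphs.
example: Dict = {
    "question": "Which was founded first, Article A or Article B?",
    "answer": "Article A",
    "context": [
        ["Article A", ["Article A is a placeholder entity.",
                       "Article A was founded in 1900."]],
        ["Article B", ["Article B is a placeholder entity.",
                       "Article B was founded in 1950."]],
    ],
    "supporting_facts": [["Article A", 1], ["Article B", 1]],
}


def supporting_sentences(record: Dict) -> List[Tuple[str, str]]:
    """Resolve each supporting fact to its (title, sentence text) pair."""
    paragraphs = {title: sentences for title, sentences in record["context"]}
    facts = []
    for title, sent_idx in record["supporting_facts"]:
        sentences = paragraphs.get(title, [])
        # Skip annotations that point outside the paragraph.
        if 0 <= sent_idx < len(sentences):
            facts.append((title, sentences[sent_idx]))
    return facts


for title, sentence in supporting_sentences(example):
    print(f"{title}: {sentence}")
```

Keeping supporting facts as (title, sentence index) pairs rather than raw text makes it straightforward to compare a model's predicted sentences against the human annotation.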

HotpotQA features a diverse range of reasoning strategies, including questions that require inferring an entity missing from the question, intersection questions (What satisfies property A and property B?), and comparison questions, in which two entities are compared by a common attribute, among others.

The dataset has two settings. In the few-document distractor setting, QA models are given ten paragraphs that are guaranteed to contain the gold paragraphs; in the open-domain fullwiki setting, models are given only the question and the entire Wikipedia.

Models are evaluated on answer accuracy and explainability. Answer accuracy is measured as the overlap between the predicted and gold answers, using exact match (EM) and unigram F1. Explainability measures how well the predicted supporting fact sentences match the human annotation (Supporting Fact EM/F1). A joint metric is also reported, which encourages systems to perform well on both tasks simultaneously.
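The metrics above can be sketched in a few lines of Python. This is a hedged illustration rather than the official evaluation script: the SQuAD-style normalization (lowercasing, stripping punctuation and articles) and the joint formulation (multiplying answer and supporting-fact precision and recall before taking F1, and multiplying the two EM scores) are assumptions, and all function names are hypothetical.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation and articles, collapse whitespace (assumed)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def precision_recall_f1(pred_items, gold_items):
    """Multiset overlap between predicted and gold items."""
    overlap = sum((Counter(pred_items) & Counter(gold_items)).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    p = overlap / len(pred_items)
    r = overlap / len(gold_items)
    return p, r, 2 * p * r / (p + r)


def answer_scores(pred, gold):
    """Answer EM plus unigram precision, recall, and F1."""
    em = float(normalize(pred) == normalize(gold))
    return (em, *precision_recall_f1(normalize(pred).split(),
                                     normalize(gold).split()))


def supporting_fact_scores(pred_facts, gold_facts):
    """Supporting Fact EM/F1 over sets of (title, sentence_index) pairs."""
    pred_set, gold_set = set(pred_facts), set(gold_facts)
    em = float(pred_set == gold_set)
    return (em, *precision_recall_f1(list(pred_set), list(gold_set)))


def joint_scores(ans, sp):
    """Joint EM and F1 (assumed: product of EMs, F1 of multiplied P and R)."""
    ans_em, ans_p, ans_r, _ = ans
    sp_em, sp_p, sp_r, _ = sp
    p, r = ans_p * sp_p, ans_r * sp_r
    f1 = 2 * p * r / (p + r) if p + r > 0 else 0.0
    return ans_em * sp_em, f1


ans = answer_scores("Paris France", "Paris")
sp = supporting_fact_scores([("A", 0), ("B", 1)], [("A", 0), ("B", 2)])
print(ans, sp, joint_scores(ans, sp))
```

Under this formulation the joint score is high only when both the answer and the supporting facts are right, which is one way to realize the "perform well on both tasks simultaneously" goal.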

Source: Answering Complex Open-domain Questions Through Iterative Query Generation
Image Source: Yang et al.

Variants: HotpotQA, hotpot_qa

Associated Benchmarks

This dataset is used in 2 benchmarks:

Recent Benchmark Submissions

Task | Model | Paper | Date
Retrieval | BM25S | BM25S: Orders of magnitude faster … | 2024-07-04
Retrieval | Rank-BM25 | BM25S: Orders of magnitude faster … | 2024-07-04
Retrieval | Elasticsearch | BM25S: Orders of magnitude faster … | 2024-07-04
Question Answering | Beam Retrieval | End-to-End Beam Retrieval for Multi-Hop … | 2023-08-17
Question Answering | Chain-of-Skills | Chain-of-Skills: A Configurable Model for … | 2023-05-04
Question Answering | AISO | Adaptive Information Seeking for Open-Domain … | 2021-09-14
Question Answering | HopRetriever + Sp-search | HopRetriever: Retrieve Hops over Wikipedia … | 2020-12-31
Question Answering | IRRR+ | Answering Open-Domain Questions of Varying … | 2020-10-23
Question Answering | IRRR | Answering Open-Domain Questions of Varying … | 2020-10-23
Question Answering | Recursive Dense Retriever | Answering Complex Open-Domain Questions with … | 2020-09-27
Question Answering | DDRQA | Answering Any-hop Open-domain Questions with … | 2020-09-16
Question Answering | BigBird-etc | Big Bird: Transformers for Longer … | 2020-07-28
Question Answering | Quark + SemanticRetrievalMRS IR | A Simple Yet Strong Pipeline … | 2020-04-14
Question Answering | Robustly Fine-tuned Graph-based Recurrent Retriever | Learning to Retrieve Reasoning Paths … | 2019-11-24
Question Answering | HGN + SemanticRetrievalMRS IR | Hierarchical Graph Network for Multi-hop … | 2019-11-09
Question Answering | KGNN | Multi-Paragraph Reasoning with Knowledge-enhanced Graph … | 2019-11-06
Question Answering | GoldEn Retriever | Answering Complex Open-domain Questions Through … | 2019-10-15
Question Answering | SemanticRetrievalMRS | Revealing the Importance of Semantic … | 2019-09-17
Question Answering | MUPPET | Multi-Hop Paragraph Retrieval for Open-Domain … | 2019-06-15
Question Answering | DecompRC | Multi-hop Reading Comprehension through Question … | 2019-06-07

Research Papers

Recent papers with results on this dataset: