TQA

Textbook Question Answering

Dataset Information
Modalities: Texts
Introduced: 2017

Overview

The Textbook Question Answering (TQA) dataset is drawn from middle school science curricula. It consists of 1,076 lessons from Life Science, Earth Science and Physical Science textbooks. The lessons contain 26,260 questions, 12,567 of which have an accompanying diagram.
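To put these counts in proportion, the short sketch below derives a few summary figures from the three numbers quoted in the overview. The variable names are illustrative only, and the per-lesson figure is a simple mean, not a statement about how questions are actually distributed across lessons.

```python
# Counts as quoted in the overview; nothing here is read from the dataset itself.
LESSONS = 1_076
QUESTIONS = 26_260
DIAGRAM_QUESTIONS = 12_567

# Questions without an accompanying diagram.
non_diagram_questions = QUESTIONS - DIAGRAM_QUESTIONS

# Fraction of all questions that come with a diagram.
diagram_share = DIAGRAM_QUESTIONS / QUESTIONS

# Mean number of questions per lesson (an average, not a per-lesson guarantee).
questions_per_lesson = QUESTIONS / LESSONS

print(f"questions without a diagram: {non_diagram_questions:,}")  # 13,693
print(f"share with a diagram: {diagram_share:.1%}")               # 47.9%
print(f"mean questions per lesson: {questions_per_lesson:.1f}")   # 24.4
```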

The TQA dataset encourages work on the Multi-Modal Machine Comprehension (M3C) task. M3C builds on the popular Visual Question Answering (VQA) and Machine Comprehension (MC) paradigms by framing question answering as a machine comprehension task in which the context needed to answer each question is provided and consists of both text and images. The dataset was built from a middle school science curriculum to showcase this task, and it pairs each question with the limited span of knowledge needed to answer it.
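One way to picture the M3C framing is as a record that bundles a question with its lesson context, where both sides may carry images. The sketch below is a minimal illustration under stated assumptions: the class names, field names, and the multiple-choice answer format are hypothetical and do not reproduce the released TQA file schema.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class Lesson:
    """Context for a group of questions: lesson text plus any figures (hypothetical layout)."""
    lesson_id: str
    text: str
    image_paths: List[str] = field(default_factory=list)


@dataclass
class Question:
    """A question tied to one lesson; the multiple-choice format is an assumption."""
    question_id: str
    lesson_id: str
    prompt: str
    choices: List[str]
    answer_index: int
    diagram_path: Optional[str] = None  # set only for diagram questions

    @property
    def has_diagram(self) -> bool:
        return self.diagram_path is not None


def build_m3c_input(lesson: Lesson, question: Question) -> Dict[str, Any]:
    """Pair a question with the text-and-image context it should be answered from."""
    if question.lesson_id != lesson.lesson_id:
        raise ValueError("question does not belong to this lesson")
    return {
        "context_text": lesson.text,
        "context_images": list(lesson.image_paths),
        "question": question.prompt,
        "question_image": question.diagram_path,
        "choices": list(question.choices),
    }


# Made-up example content, for illustration only.
example_lesson = Lesson("L_0001", "Plants make food by photosynthesis.", ["fig_leaf.png"])
example_question = Question(
    "Q_0001", "L_0001", "Which process do plants use to make food?",
    ["respiration", "photosynthesis"], answer_index=1,
)
example_input = build_m3c_input(example_lesson, example_question)
```

A model for this task would consume something like `example_input` and return an index into `choices`. Unlike plain VQA, the answer is expected to come from the supplied lesson context and not from the question image alone.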

Source: Allen Institute for AI

Variants: TQA

Associated Benchmarks

This dataset is used in 1 benchmark:

Recent Benchmark Submissions

Task | Model | Paper | Date
Open-Domain Question Answering | BPR (linear scan; l=1000) | Efficient Passage Retrieval with Hashing … | 2021-06-02
Open-Domain Question Answering | UniK-QA | UniK-QA: Unified Representations of Structured … | 2020-12-29

Research Papers

Recent papers with results on this dataset: