DuoRC

Name: DuoRC
Published: 2018-04-21
License: Unknown

Dataset Information

Modalities

Texts

Languages

English

Introduced

2018

License

Unknown

Homepage

Official Website

Contents

Overview
Associated Benchmarks
Recent Benchmark Submissions
Research Papers

Overview

DuoRC contains 186,089 unique question-answer pairs created from a collection of 7,680 pairs of movie plots, where each pair reflects two versions of the same movie.

Why another RC dataset?

DuoRC pushes the NLP community to address the challenges of incorporating knowledge and reasoning in neural architectures for reading comprehension. It poses several challenges:

By using parallel plots, DuoRC is designed to contain a large number of questions with low lexical overlap between the questions and their corresponding passages.
It requires models to go beyond the content of the given passage and incorporate world knowledge, background knowledge, and common-sense knowledge to arrive at the answer.
It revolves around narrative passages from movie plots that describe complex events, and therefore naturally requires complex reasoning (e.g. temporal reasoning, entailment, long-distance anaphora) across multiple sentences to infer the answer.
Several of the questions in DuoRC, while seemingly relevant, cannot be answered from the given passage, so the model must detect that a question is unanswerable. This ability is particularly important in industrial settings.

Source: DuoRC
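
To make the notion of "low lexical overlap" concrete, here is a minimal sketch of one way it could be measured: the fraction of distinct question tokens that also appear in the passage. The tokenizer, the function names, and the example texts are assumptions made for illustration; they are not taken from DuoRC or its paper, which may define overlap differently (for example, after removing stop words).

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def lexical_overlap(question, passage):
    """Return the fraction of distinct question tokens found in the passage."""
    q_tokens = set(tokenize(question))
    if not q_tokens:
        return 0.0
    p_tokens = set(tokenize(passage))
    return len(q_tokens & p_tokens) / len(q_tokens)


# Invented example texts (not DuoRC data).
passage = "The detective follows the thief across the city at night."
high = lexical_overlap("Who follows the thief across the city?", passage)
low = lexical_overlap("Who pursues the burglar through town?", passage)
```

In this sketch the second question paraphrases the first with different vocabulary, so its overlap score is lower even though both ask about the same event. A question written against one version of a plot and answered from the other version would tend to look like the second case.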

Variants: DuoRC

Associated Benchmarks

This dataset is used in 1 benchmark:

Question Answering - Metrics: Accuracy
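
This page does not say how accuracy is computed for the benchmark. As a rough sketch only, assuming an exact-match scheme over normalized answer strings, with an unanswerable question represented by an empty string (both are assumptions, not the benchmark's documented protocol):

```python
import string


def normalize(answer):
    """Lowercase, strip punctuation, and collapse whitespace."""
    lowered = answer.lower()
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    return " ".join(no_punct.split())


def accuracy(predictions, references):
    """Share of predictions that match any reference answer after normalization.

    predictions: list of predicted answer strings ("" means "unanswerable").
    references: list of lists of acceptable answer strings per question.
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        return 0.0
    correct = 0
    for predicted, acceptable in zip(predictions, references):
        if normalize(predicted) in {normalize(a) for a in acceptable}:
            correct += 1
    return correct / len(references)


# Invented examples (not DuoRC data): two of the three predictions match.
predictions = ["The Detective.", "", "Paris"]
references = [["the detective"], [""], ["London"]]
score = accuracy(predictions, references)
```

Treating "unanswerable" as just another answer string lets the same metric reward a model for correctly declining to answer.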

Recent Benchmark Submissions

Task	Model	Paper	Date
Question Answering	Vector Database (ChromaDB)	RecallM: An Adaptable Memory Mechanism …	2023-07-06
Question Answering	Hybrid-RecallM	RecallM: An Adaptable Memory Mechanism …	2023-07-06
Question Answering	RecallM	RecallM: An Adaptable Memory Mechanism …	2023-07-06

Research Papers

Recent papers with results on this dataset:

RecallM: An Adaptable Memory Mechanism with Temporal Understanding for Large Language Models (2023)

External Links:

DuoRC
