MultiRC

Name: MultiRC
Published: 2018-01-01
License: Custom (research-only)

Multi-Sentence Reading Comprehension

Dataset Information

Modalities: Texts
Languages: English
Introduced: 2018
License: Custom (research-only)
Homepage: Official Website

Overview

MultiRC (Multi-Sentence Reading Comprehension) is a dataset of short paragraphs and multi-sentence questions, i.e., questions that can be answered by combining information from multiple sentences of the paragraph.
The dataset was designed with three key challenges in mind:
* The number of correct answer-options for each question is not pre-specified. This removes over-reliance on the set of answer-options and forces a system to decide on the correctness of each candidate answer independently of the others. In other words, the task is not simply to identify the best answer-option, but to evaluate the correctness of each answer-option individually.
* Correct answers are not required to be spans of the text.
* The paragraphs come from 7 different domains, such as news, fiction, and historical text, and are therefore expected to be more diverse in content than those of single-domain datasets.
The entire corpus consists of around 10K questions (including about 6K multiple-sentence questions). About 60% of the data is released as training and development data. The rest is held out for evaluation, and every few months new unseen data is added to the evaluation set to prevent unintentional overfitting over time.
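
To make the first design point concrete, the sketch below models one question as a list of answer-options, each carrying its own binary label, and flattens it into independent classification instances. The class names, field names, and the sample paragraph are hypothetical and chosen for illustration only; they do not reflect the official release format.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class AnswerOption:
    text: str
    is_correct: bool  # each option is labeled independently of the others


@dataclass
class Question:
    paragraph: str
    question: str
    options: List[AnswerOption]


def to_binary_instances(q: Question) -> List[Tuple[str, str, str, int]]:
    """Flatten one question into (paragraph, question, option, label) tuples."""
    return [(q.paragraph, q.question, o.text, int(o.is_correct)) for o in q.options]


# Hypothetical example: two of the three options are correct, and answering
# requires combining both sentences of the paragraph.
example = Question(
    paragraph="Ana bought a kite. She flew it at the beach with her brother.",
    question="Who was at the beach with the kite?",
    options=[
        AnswerOption("Ana", True),
        AnswerOption("Ana's brother", True),
        AnswerOption("Ana's teacher", False),
    ],
)

instances = to_binary_instances(example)
labels = [label for *_, label in instances]
```

Because the number of correct options is not fixed, a model cannot rely on picking a single best option; every tuple in `instances` has to be classified on its own.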

Source: https://cogcomp.seas.upenn.edu/multirc/
Image Source: https://paperswithcode.com/paper/looking-beyond-the-surface-a-challenge-set/

Variants: MultiRC

Associated Benchmarks

This dataset is used in 1 benchmark:

Question Answering - Metrics: F1, EM
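
The page lists F1 and EM without defining them. The sketch below assumes one common reading: EM counts a question as correct only when every one of its answer-options is labeled correctly, and F1 is computed over all binary answer-option decisions pooled together. The function names and toy labels are illustrative and are not the official evaluation script.

```python
from typing import List


def exact_match(gold: List[List[int]], pred: List[List[int]]) -> float:
    """Fraction of questions whose answer-option labels are all predicted correctly."""
    hits = sum(1 for g, p in zip(gold, pred) if g == p)
    return hits / len(gold)


def f1_over_options(gold: List[List[int]], pred: List[List[int]]) -> float:
    """F1 of the positive class, pooled over every answer-option of every question."""
    tp = fp = fn = 0
    for gold_q, pred_q in zip(gold, pred):
        for g, p in zip(gold_q, pred_q):
            tp += int(g == 1 and p == 1)
            fp += int(g == 0 and p == 1)
            fn += int(g == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (assumed): one inner list of 0/1 labels per question.
gold = [[1, 1, 0], [0, 1]]
pred = [[1, 0, 0], [0, 1]]

em = exact_match(gold, pred)      # only the second question matches fully
f1 = f1_over_options(gold, pred)  # precision 1.0, recall 2/3
```

Under these assumptions, a single missed option lowers EM for the whole question but only partially lowers F1, which is why the two metrics are usually reported together.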

Recent Benchmark Submissions

Task	Model	Paper	Date
Question Answering	PaLM 2-L (one-shot)	PaLM 2 Technical Report	2023-05-17
Question Answering	PaLM 2-S (one-shot)	PaLM 2 Technical Report	2023-05-17
Question Answering	PaLM 2-M (one-shot)	PaLM 2 Technical Report	2023-05-17
Question Answering	Bloomberg GPT 50B (1-shot)	BloombergGPT: A Large Language Model …	2023-03-30
Question Answering	OPT 66B (1-shot)	BloombergGPT: A Large Language Model …	2023-03-30
Question Answering	GPT-NeoX 20B (1-shot)	BloombergGPT: A Large Language Model …	2023-03-30
Question Answering	BLOOM 176B (1-shot)	BloombergGPT: A Large Language Model …	2023-03-30
Question Answering	Hybrid H3 355M (3-shot, logit scoring)	Hungry Hungry Hippos: Towards Language …	2022-12-28
Question Answering	Hybrid H3 125M (3-shot, logit scoring)	Hungry Hungry Hippos: Towards Language …	2022-12-28
Question Answering	Hybrid H3 125M (0-shot, logit scoring)	Hungry Hungry Hippos: Towards Language …	2022-12-28
Question Answering	Hybrid H3 355M (0-shot, logit scoring)	Hungry Hungry Hippos: Towards Language …	2022-12-28
Question Answering	Vega v2 6B (fine-tuned)	Toward Efficient Language Model Pretraining …	2022-12-04
Question Answering	Turing NLR v5 XXL 5.4B (fine-tuned)	Toward Efficient Language Model Pretraining …	2022-12-04
Question Answering	Neo-6B (few-shot)	Ask Me Anything: A simple …	2022-10-05
Question Answering	Neo-6B (QA)	Ask Me Anything: A simple …	2022-10-05
Question Answering	Neo-6B (QA + WS)	Ask Me Anything: A simple …	2022-10-05
Question Answering	AlexaTM 20B	AlexaTM 20B: Few-Shot Learning Using …	2022-08-02
Question Answering	N-Grammer 343M	N-Grammer: Augmenting Transformers with latent …	2022-07-13
Question Answering	PaLM 540B (finetuned)	PaLM: Scaling Language Modeling with …	2022-04-05
Question Answering	ST-MoE-32B 269B (fine-tuned)	ST-MoE: Designing Stable and Transferable …	2022-02-17
