FEVER

Name: FEVER
Published: 2018-03-14
License: Custom

Fact Extraction and VERification

Dataset Information

Modalities: Texts
Languages: English
Introduced: 2018

Contents

Overview
Associated Benchmarks
Recent Benchmark Submissions
Research Papers

Overview

FEVER is a publicly available dataset for fact extraction and verification against textual sources.

It consists of 185,445 claims manually verified against the introductory sections of Wikipedia pages and classified as SUPPORTED, REFUTED or NOTENOUGHINFO. For the first two classes, systems and annotators must also return the combination of sentences that forms the necessary evidence supporting or refuting the claim.
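To make the structure described above concrete, here is a minimal sketch of how one claim record could be modelled. The class name `Claim`, its field names, and the `(page title, sentence index)` representation of an evidence sentence are assumptions made for illustration, not the schema of the released files; the label strings are the ones used on this page.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Label names as written on this page; the released files may spell them differently.
LABELS = ("SUPPORTED", "REFUTED", "NOTENOUGHINFO")
EVIDENCE_REQUIRED = ("SUPPORTED", "REFUTED")

# Hypothetical representation of one evidence sentence: (page title, sentence index).
Sentence = Tuple[str, int]


@dataclass
class Claim:
    """One claim with its label and zero or more alternative evidence sets."""

    text: str
    label: str
    # Each inner list is one combination of sentences that is sufficient on its own.
    evidence: List[List[Sentence]] = field(default_factory=list)

    def validate(self) -> None:
        """Raise ValueError if the record contradicts the description above."""
        if self.label not in LABELS:
            raise ValueError(f"unknown label: {self.label!r}")
        if self.label in EVIDENCE_REQUIRED and not any(self.evidence):
            raise ValueError(f"{self.label} claims need at least one evidence sentence")


# Placeholder record, not taken from the dataset.
example = Claim(
    text="Example claim text.",
    label="SUPPORTED",
    evidence=[[("Example_Page", 0)]],
)
example.validate()
```

Storing evidence as a list of alternative sets reflects the point that a claim may be supported or refuted by more than one combination of sentences.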

The claims were generated by human annotators who extracted claims from Wikipedia and mutated them in a variety of ways, some of which were meaning-altering. The verification of each claim was conducted in a separate annotation process by annotators who were aware of the page but not the sentence from which the original claim was extracted. As a result, in 31.75% of the claims more than one sentence was considered appropriate evidence. Claims require composition of evidence from multiple sentences in 16.82% of cases. Furthermore, in 12.15% of the claims, this evidence was taken from multiple pages.
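As a rough sense of scale, the percentages above can be converted into approximate claim counts. This sketch assumes each percentage is relative to all 185,445 claims, which the text does not state explicitly; if the base is a subset (for example, only claims that have evidence), the counts would be smaller. The function and variable names are illustrative.

```python
TOTAL_CLAIMS = 185_445

# Percentages quoted in the overview; the base population is an assumption.
SHARES_PERCENT = {
    "more than one evidence sentence": 31.75,
    "evidence composed from multiple sentences": 16.82,
    "evidence taken from multiple pages": 12.15,
}


def approximate_count(total: int, percent: float) -> int:
    """Convert a percentage of `total` into a rounded claim count."""
    return round(total * percent / 100)


counts = {
    name: approximate_count(TOTAL_CLAIMS, percent)
    for name, percent in SHARES_PERCENT.items()
}

for name, count in counts.items():
    print(f"{name}: ~{count:,} claims")
```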

Source: FEVER: a large-scale dataset for Fact Extraction and VERification

Variants: FEVER

Associated Benchmarks

This dataset is used in 2 benchmarks:

Question Answering - Metrics: EM
Fact Verification - Metrics: Accuracy, FEVER
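This page does not define the Fact Verification metrics. The sketch below assumes the commonly described definitions: Accuracy is label accuracy, and the FEVER score counts a prediction as correct only if the label is right and, for claims that need evidence, the returned sentences contain at least one complete gold evidence set. The cap of five returned sentences, the `Prediction` class, and the label strings are assumptions for illustration; consult the official scorer for the authoritative behaviour.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple

# Hypothetical representation of one evidence sentence: (page title, sentence index).
Sentence = Tuple[str, int]

# Label for claims that need no evidence, as spelled on this page.
NO_EVIDENCE_LABEL = "NOTENOUGHINFO"


@dataclass
class Prediction:
    gold_label: str
    predicted_label: str
    # Alternative gold evidence sets; retrieving any one of them in full is enough.
    gold_evidence_sets: List[Set[Sentence]] = field(default_factory=list)
    predicted_evidence: List[Sentence] = field(default_factory=list)


def label_accuracy(predictions: List[Prediction]) -> float:
    """Fraction of claims whose predicted label matches the gold label."""
    if not predictions:
        return 0.0
    hits = sum(p.predicted_label == p.gold_label for p in predictions)
    return hits / len(predictions)


def is_strictly_correct(p: Prediction, max_evidence: int = 5) -> bool:
    """Label must match and, where evidence is required, one gold set must be covered."""
    if p.predicted_label != p.gold_label:
        return False
    if p.gold_label == NO_EVIDENCE_LABEL:
        return True
    retrieved = set(p.predicted_evidence[:max_evidence])
    return any(gold <= retrieved for gold in p.gold_evidence_sets)


def fever_score(predictions: List[Prediction], max_evidence: int = 5) -> float:
    """Fraction of claims that are strictly correct (label and evidence)."""
    if not predictions:
        return 0.0
    hits = sum(is_strictly_correct(p, max_evidence) for p in predictions)
    return hits / len(predictions)
```

Under these assumptions the FEVER score can never exceed label accuracy, since it adds an evidence requirement on top of a correct label.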

Recent Benchmark Submissions

Task	Model	Paper	Date
Question Answering	CoA w/o actions	Chain-of-Action: Faithful and Multimodal Question …	2024-03-26
Question Answering	Zero-shot	Chain-of-Action: Faithful and Multimodal Question …	2024-03-26
Question Answering	Self-Ask	Chain-of-Action: Faithful and Multimodal Question …	2024-03-26
Question Answering	CoA	Chain-of-Action: Faithful and Multimodal Question …	2024-03-26
Question Answering	DSP	Chain-of-Action: Faithful and Multimodal Question …	2024-03-26
Question Answering	DSP	DSPy: Compiling Declarative Language Model …	2023-10-05
Question Answering	Self-Ask	Measuring and Narrowing the Compositionality …	2022-10-07
Fact Verification	RoBERTa-Base Joint MSPP	Paragraph-based Transformer Pre-training for Multi-Sentence …	2022-05-02
Fact Verification	RoBERTa-Base Joint MSPP Flexible	Paragraph-based Transformer Pre-training for Multi-Sentence …	2022-05-02
Fact Verification	ProoFVer-SB	ProoFVer: Natural Logic Theorem Proving …	2021-08-25
Fact Verification	RAG	Retrieval-Augmented Generation for Knowledge-Intensive NLP …	2020-05-22
Fact Verification	KGAT	Fine-grained Fact Verification with Kernel …	2019-10-22
Fact Verification	DREAM	Reasoning Over Semantic-Level Graph for …	2019-09-09
Fact Verification	GEAR	GEAR: Graph-based Evidence Aggregating and …	2019-07-22

Research Papers

Recent papers with results on this dataset:
