TriviaQA

Name: TriviaQA
Published: 2017-01-01
License: Unknown

Dataset Information

Modalities: Text
Languages: English

Overview

TriviaQA is a realistic text-based question answering dataset that includes about 95K question-answer pairs, with evidence drawn from 662K documents collected from Wikipedia and the web. It is more challenging than standard QA benchmarks such as the Stanford Question Answering Dataset (SQuAD), because the answer to a question may not be directly obtainable by span prediction and the context is very long. The dataset consists of both human-verified and machine-generated QA subsets.

Source: Episodic Memory Reader: Learning What to Remember for Question Answering from Streaming Data
Image Source: Joshi et al.

Variants: TriviaQA, KILT: TriviaQA

Associated Benchmarks

This dataset is used in 4 benchmarks:

Question Answering - Metrics: EM, F1
Text Generation - Metrics: acc
Question Generation - Metrics: QAE, R-QAE
Open-Domain Question Answering - Metrics: Exact Match
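
The EM (Exact Match) and F1 metrics listed above can be illustrated with a minimal sketch. It assumes SQuAD-style answer normalization (lowercasing, removing punctuation and English articles) and takes the best score over several acceptable reference answers. The function names here are illustrative and are not taken from any official TriviaQA evaluation script, whose exact normalization may differ.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: SQuAD-style normalization, used here only for illustration.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; only one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def best_over_references(metric, prediction: str, references: list[str]) -> float:
    """Score a prediction against each acceptable answer and keep the best."""
    return max(metric(prediction, ref) for ref in references)
```

For example, `best_over_references(exact_match, "JFK", ["John F. Kennedy", "JFK"])` scores the prediction as correct because it matches one of the acceptable answers. Taking the maximum over references is a common choice when a question has several valid surface forms.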

Recent Benchmark Submissions

Task	Model	Paper	Date
Question Answering	Search-o1	Search-o1: Agentic Search-Enhanced Large Reasoning …	2025-01-09
Question Answering	Shakti-LLM (2.5B)	SHAKTI: A 2.5 Billion Parameter …	2024-10-15
Question Answering	RankRAG-llama3-70b (Zero-Shot, KILT)	RankRAG: Unifying Context Ranking with …	2024-07-02
Question Answering	RankRAG-llama3-70b (Zero-Shot, DPR)	RankRAG: Unifying Context Ranking with …	2024-07-02
Question Answering	RankRAG-llama3-8b (Zero-Shot, KILT)	RankRAG: Unifying Context Ranking with …	2024-07-02
Question Answering	DPA-RAG	Understand What LLM Needs: Dual …	2024-06-26
Question Answering	GaC(Qwen2-72B-Instruct + Llama-3-70B-Instruct)	Breaking the Ceiling of the …	2024-06-18
Question Answering	Branch-Train-MiX 4x7B (sampling top-2 experts)	Branch-Train-MiX: Mixing Expert LLMs into …	2024-03-12
Question Answering	ChatQA-1.5-llama3-70b (Zero-Shot, KILT)	ChatQA: Surpassing GPT-4 on Conversational …	2024-01-18
Question Answering	ChatQA-1.5-llama3-8B (Zero-Shot, KILT)	ChatQA: Surpassing GPT-4 on Conversational …	2024-01-18
Question Answering	ChatQA-1.5-llama3-70b (Zero-Shot, DPR)	ChatQA: Surpassing GPT-4 on Conversational …	2024-01-18
Question Answering	Mistral 7B (5-shot)	Mistral 7B	2023-10-10
Question Answering	RA-DIT (Zero-Shot)	RA-DIT: Retrieval-Augmented Dual Instruction Tuning	2023-10-02
Question Answering	LLaMA 2 70B (one-shot)	Llama 2: Open Foundation and …	2023-07-18
Question Answering	PaLM 2-M (one-shot)	PaLM 2 Technical Report	2023-05-17
Question Answering	PaLM 2-S (one-shot)	PaLM 2 Technical Report	2023-05-17
Question Answering	PaLM 2-L (one-shot)	PaLM 2 Technical Report	2023-05-17
Question Answering	GPT-4-0613 (Zero-shot)	GPT-4 Technical Report	2023-03-15
Question Answering	LLaMA 65B (one-shot)	LLaMA: Open and Efficient Foundation …	2023-02-27
Question Answering	LLaMA 65B (zero-shot)	LLaMA: Open and Efficient Foundation …	2023-02-27
