CommonsenseQA

Name: CommonsenseQA
Published: 2019-01-01
License: Unknown

CSQA

Dataset Information

Modalities

Texts

Languages

English

Introduced

2019

License

Unknown

Homepage

Official Website

Contents

Overview
Associated Benchmarks
Recent Benchmark Submissions
Research Papers

Overview

CommonsenseQA is a dataset for the commonsense question answering task. It consists of 12,247 multiple-choice questions with 5 answer choices each.
The dataset was created by Amazon Mechanical Turk workers through the following process (an example is given in parentheses):

A crowd worker is shown a source concept from ConceptNet (“River”) and three target concepts (“Waterfall”, “Bridge”, “Valley”) that are all related to it by the same ConceptNet relation (“AtLocation”).
The worker authors three questions, one per target concept, such that only that target concept is the correct answer while the other two target concepts act as distractors (“Where on a river can you hold a cup upright to catch water on a sunny day?”, “Where can I stand on a river to see water falling without getting wet?”, “I’m crossing the river, my feet are wet but my body is dry, where am I?”).
For each question, another worker chooses one additional distractor from ConceptNet (“pebble”, “stream”, “bank”) and authors another distractor (“mountain”, “bottom”, “island”) manually.
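
The process above can be sketched in code to show how each question ends up with 5 choices: the correct target concept, the two other target concepts, one ConceptNet distractor, and one manually written distractor. This is only an illustration. The `Question` class and `build_questions` function are hypothetical names, not part of any official CommonsenseQA tooling, and pairing the questions and distractors with the targets by their order in the example is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Question:
    """Hypothetical container for one multiple-choice item."""
    text: str
    answer: str
    choices: List[str]


def build_questions(targets, question_texts, conceptnet_distractors, authored_distractors):
    """Assemble one 5-choice question per target concept.

    Assumes all four lists are aligned by index (an assumption made for
    this sketch, based on the order of the parenthesized examples).
    """
    items = []
    for i, target in enumerate(targets):
        # The other two target concepts serve as distractors.
        other_targets = [t for t in targets if t != target]
        choices = [target] + other_targets + [
            conceptnet_distractors[i],  # distractor chosen from ConceptNet
            authored_distractors[i],    # distractor written manually
        ]
        items.append(Question(text=question_texts[i], answer=target, choices=choices))
    return items


# The "River" / "AtLocation" example from the description above.
items = build_questions(
    targets=["waterfall", "bridge", "valley"],
    question_texts=[
        "Where on a river can you hold a cup upright to catch water on a sunny day?",
        "Where can I stand on a river to see water falling without getting wet?",
        "I’m crossing the river, my feet are wet but my body is dry, where am I?",
    ],
    conceptnet_distractors=["pebble", "stream", "bank"],
    authored_distractors=["mountain", "bottom", "island"],
)

for item in items:
    print(item.answer, item.choices)
```

The sketch keeps the correct answer first in `choices` for readability; a real evaluation set would normally present the choices in a shuffled or labeled order.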

Source: CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge

Variants: CommonsenseQA

Associated Benchmarks

This dataset is used in 1 benchmark:

Common Sense Reasoning - Metrics: Accuracy
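
As a rough illustration of the Accuracy metric named above, the following sketch computes the fraction of questions for which the predicted choice matches the gold answer. The function name `accuracy` and the use of choice labels such as "A" to "E" are assumptions made for this example, not a description of any official scoring script. With 5 choices per question, uniform random guessing is expected to score 1/5.

```python
def accuracy(predictions, gold):
    """Fraction of predictions that exactly match the gold answers."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)


# Expected accuracy of uniform random guessing over 5 choices.
NUM_CHOICES = 5
CHANCE_ACCURACY = 1 / NUM_CHOICES

# Hypothetical predictions and gold labels for four questions.
score = accuracy(["A", "B", "C", "D"], ["A", "B", "E", "D"])
print(score)  # 0.75
```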

Recent Benchmark Submissions

Task	Model	Paper	Date
Common Sense Reasoning	GPT-4o (HPT)	Hierarchical Prompting Taxonomy: A Universal …	2024-06-18
Common Sense Reasoning	PaLM 2 (few‑shot, CoT, SC)	PaLM 2 Technical Report	2023-05-17
Common Sense Reasoning	OPT 66B (1-shot)	BloombergGPT: A Large Language Model …	2023-03-30
Common Sense Reasoning	GPT-NeoX 20B (1-shot)	BloombergGPT: A Large Language Model …	2023-03-30
Common Sense Reasoning	BLOOM 176B (1-shot)	BloombergGPT: A Large Language Model …	2023-03-30
Common Sense Reasoning	Bloomberg GPT 50B (1-shot)	BloombergGPT: A Large Language Model …	2023-03-30
Common Sense Reasoning	GrapeQA: PEGA	GrapeQA: GRaph Augmentation and Pruning …	2023-03-22
Common Sense Reasoning	DRAGON	Deep Bidirectional Language-Knowledge Graph Pretraining	2022-10-17
Common Sense Reasoning	UL2 20B (chain-of-thought + self-consistency)	UL2: Unifying Language Learning Paradigms	2022-05-10
Common Sense Reasoning	UL2 20B (zero-shot)	UL2: Unifying Language Learning Paradigms	2022-05-10
Common Sense Reasoning	UL2 20B (chain-of-thought)	UL2: Unifying Language Learning Paradigms	2022-05-10
Common Sense Reasoning	Few-shot Direct GPT-J	STaR: Bootstrapping Reasoning With Reasoning	2022-03-28
Common Sense Reasoning	Few-shot CoT GPT-J	STaR: Bootstrapping Reasoning With Reasoning	2022-03-28
Common Sense Reasoning	STaR (on GPT-J)	STaR: Bootstrapping Reasoning With Reasoning	2022-03-28
Common Sense Reasoning	STaR without Rationalization (on GPT-J)	STaR: Bootstrapping Reasoning With Reasoning	2022-03-28
Common Sense Reasoning	GPT-J Direct Finetuned	STaR: Bootstrapping Reasoning With Reasoning	2022-03-28
Common Sense Reasoning	Few-shot CoT LaMDA 137B	STaR: Bootstrapping Reasoning With Reasoning	2022-03-28
Common Sense Reasoning	Chain of thought ASDiv	Chain-of-Thought Prompting Elicits Reasoning in …	2022-01-28
Common Sense Reasoning	GPT-3 Direct Finetuned	Human Parity on CommonsenseQA: Augmenting …	2021-12-06
Common Sense Reasoning	KEAR	Human Parity on CommonsenseQA: Augmenting …	2021-12-06

Research Papers

Recent papers with results on this dataset:

External Links:

CommonsenseQA
