CommonsenseQA

CSQA

Dataset Information
Modalities: Texts
Languages: English
Introduced: 2019
License: Unknown

Overview

CommonsenseQA is a dataset for the commonsense question answering task. It consists of 12,247 multiple-choice questions, each with five answer choices.
The questions were written by Amazon Mechanical Turk workers through the following process (an example is given in parentheses):

  1. a crowd worker observes a source concept from ConceptNet (“River”) and three target concepts (“Waterfall”, “Bridge”, “Valley”) that are all related to it by the same ConceptNet relation (“AtLocation”),
  2. the worker authors three questions, one per target concept, such that only that particular target concept is the answer, while the other two target concepts serve as distractors (“Where on a river can you hold a cup upright to catch water on a sunny day?”, “Where can I stand on a river to see water falling without getting wet?”, “I’m crossing the river, my feet are wet but my body is dry, where am I?”),
  3. for each question, another worker chooses one additional distractor from ConceptNet (“pebble”, “stream”, “bank”), and the author adds another distractor (“mountain”, “bottom”, “island”) manually.
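
The three steps above can be sketched as a small data structure. This is a minimal illustration, not the authors' pipeline: the names `QuestionItem` and `build_item` are hypothetical, and the pairing of the first question with “waterfall”, “pebble”, and “mountain” assumes the parenthesized examples are listed in matching order.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class QuestionItem:
    """One multiple-choice item (hypothetical structure for illustration)."""
    question: str
    answer: str
    choices: List[str]  # five options, with the answer among them


def build_item(question, target, other_targets,
               conceptnet_distractor, authored_distractor, seed=0):
    """Assemble a 5-choice item: the target concept is the answer, the two
    other target concepts plus two added distractors are the wrong options."""
    choices = [target, *other_targets, conceptnet_distractor, authored_distractor]
    if len({c.lower() for c in choices}) != 5:
        raise ValueError("expected five distinct choices")
    # Shuffle so the answer position carries no information (assumed step).
    random.Random(seed).shuffle(choices)
    return QuestionItem(question=question, answer=target, choices=choices)


item = build_item(
    question="Where on a river can you hold a cup upright "
             "to catch water on a sunny day?",
    target="waterfall",
    other_targets=["bridge", "valley"],
    conceptnet_distractor="pebble",
    authored_distractor="mountain",
)
print(item.question)
print(item.choices)
```

With five choices per question, uniform random guessing has an expected accuracy of 20%.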

Source: CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge

Variants: CommonsenseQA

Associated Benchmarks

This dataset is used in 1 benchmark:

Recent Benchmark Submissions

Task | Model | Paper | Date
Common Sense Reasoning | GPT-4o (HPT) | Hierarchical Prompting Taxonomy: A Universal … | 2024-06-18
Common Sense Reasoning | PaLM 2 (few-shot, CoT, SC) | PaLM 2 Technical Report | 2023-05-17
Common Sense Reasoning | OPT 66B (1-shot) | BloombergGPT: A Large Language Model … | 2023-03-30
Common Sense Reasoning | GPT-NeoX 20B (1-shot) | BloombergGPT: A Large Language Model … | 2023-03-30
Common Sense Reasoning | BLOOM 176B (1-shot) | BloombergGPT: A Large Language Model … | 2023-03-30
Common Sense Reasoning | Bloomberg GPT 50B (1-shot) | BloombergGPT: A Large Language Model … | 2023-03-30
Common Sense Reasoning | GrapeQA: PEGA | GrapeQA: GRaph Augmentation and Pruning … | 2023-03-22
Common Sense Reasoning | DRAGON | Deep Bidirectional Language-Knowledge Graph Pretraining | 2022-10-17
Common Sense Reasoning | UL2 20B (chain-of-thought + self-consistency) | UL2: Unifying Language Learning Paradigms | 2022-05-10
Common Sense Reasoning | UL2 20B (zero-shot) | UL2: Unifying Language Learning Paradigms | 2022-05-10
Common Sense Reasoning | UL2 20B (chain-of-thought) | UL2: Unifying Language Learning Paradigms | 2022-05-10
Common Sense Reasoning | Few-shot Direct GPT-J | STaR: Bootstrapping Reasoning With Reasoning | 2022-03-28
Common Sense Reasoning | Few-shot CoT GPT-J | STaR: Bootstrapping Reasoning With Reasoning | 2022-03-28
Common Sense Reasoning | STaR (on GPT-J) | STaR: Bootstrapping Reasoning With Reasoning | 2022-03-28
Common Sense Reasoning | STaR without Rationalization (on GPT-J) | STaR: Bootstrapping Reasoning With Reasoning | 2022-03-28
Common Sense Reasoning | GPT-J Direct Finetuned | STaR: Bootstrapping Reasoning With Reasoning | 2022-03-28
Common Sense Reasoning | Few-shot CoT LaMDA 137B | STaR: Bootstrapping Reasoning With Reasoning | 2022-03-28
Common Sense Reasoning | Chain of thought ASDiv | Chain-of-Thought Prompting Elicits Reasoning in … | 2022-01-28
Common Sense Reasoning | GPT-3 Direct Finetuned | Human Parity on CommonsenseQA: Augmenting … | 2021-12-06
Common Sense Reasoning | KEAR | Human Parity on CommonsenseQA: Augmenting … | 2021-12-06

Research Papers

Recent papers with results on this dataset: