WinoGrande

Name: WinoGrande
License: CC-BY

Dataset Information

Modalities

Texts

Homepage

Official Website

Contents

Overview
Associated Benchmarks
Recent Benchmark Submissions
Research Papers

Overview

WinoGrande is a large-scale dataset of 44k problems inspired by the original Winograd Schema Challenge (WSC) design, adjusted to improve both the scale and the hardness of the dataset. Dataset construction has two key steps: (1) a carefully designed crowdsourcing procedure, followed by (2) systematic bias reduction using a novel AfLite algorithm, which generalizes human-detectable word associations to machine-detectable embedding associations.

Source: WinoGrande: An Adversarial Winograd Schema Challenge at Scale
Image Source: https://winogrande.allenai.org/
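
The bias-reduction step mentioned in the overview can be illustrated with a small sketch. It loosely follows the AfLite procedure as described in the WinoGrande paper: repeatedly train simple classifiers on random subsets of precomputed embeddings, score each instance by how often it is predicted correctly when held out, and remove the most predictable instances. Everything concrete here is an assumption made for illustration: the function names, the parameter names and defaults (n_rounds, train_size, k, tau), the synthetic toy data, and in particular the nearest-centroid classifier, which is swapped in for the linear classifiers used in the paper so that the example needs only the standard library.

```python
import random


def fit_centroids(train):
    """Average the embeddings of each label (stand-in for a linear classifier)."""
    sums, counts = {}, {}
    for vec, label in train:
        acc = sums.setdefault(label, [0.0] * len(vec))
        for j, v in enumerate(vec):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, vec):
    """Return the label whose centroid is nearest to vec (squared distance)."""
    return min(
        centroids,
        key=lambda label: sum((c - v) ** 2 for c, v in zip(centroids[label], vec)),
    )


def aflite_sketch(data, n_rounds=20, train_size=40, k=10, tau=0.75, seed=0):
    """data: list of (embedding, label) pairs. Returns the surviving instances."""
    rng = random.Random(seed)
    kept = list(data)
    while len(kept) > train_size:
        correct = [0] * len(kept)
        seen = [0] * len(kept)
        for _ in range(n_rounds):
            order = list(range(len(kept)))
            rng.shuffle(order)
            centroids = fit_centroids([kept[i] for i in order[:train_size]])
            for i in order[train_size:]:  # evaluate held-out instances only
                seen[i] += 1
                correct[i] += predict(centroids, kept[i][0]) == kept[i][1]
        # Predictability score: fraction of held-out predictions that were correct.
        scored = [(correct[i] / seen[i], i) for i in range(len(kept)) if seen[i]]
        # Remove at most k of the most predictable instances with score >= tau.
        above = [s for s in scored if s[0] >= tau]
        top = sorted(above, key=lambda s: (-s[0], s[1]))[:k]
        drop = {i for _, i in top}
        kept = [item for i, item in enumerate(kept) if i not in drop]
        if len(drop) < k:
            break
    return kept


def make_toy_data(seed=0):
    """Synthetic embeddings: the first coordinate leaks the label for half the data."""
    rng = random.Random(seed)
    biased = [([1.0 if i % 2 else -1.0, rng.uniform(-1, 1)], i % 2) for i in range(40)]
    neutral = [([0.0, rng.uniform(-1, 1)], rng.randrange(2)) for _ in range(40)]
    return biased + neutral


data = make_toy_data()
kept = aflite_sketch(data)
print(len(data), "instances before filtering,", len(kept), "after")
```

In the toy data, instances whose first coordinate gives away the label are easy to predict from embeddings alone, so they are the ones the filter tends to discard. The real algorithm operates on embeddings from a pretrained language model rather than hand-built vectors.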

Variants: WinoGrande, Winogrande, Winogrande (5-shot), Winogrande TR v0.2, Winogrande TR

Associated Benchmarks

This dataset is used in 4 benchmarks:

Text Generation - Metrics: acc
Common Sense Reasoning - Metrics: Accuracy
parameter-efficient fine-tuning - Metrics: Accuracy (%)
Winogrande - Metrics: Accuracy
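
The accuracy metric listed for these benchmarks can be sketched as follows. Each problem is a sentence with a blank and two candidate fillers, and a prediction counts as correct when it selects the labelled filler. The sketch assumes the commonly distributed record layout with the fields sentence, option1, option2, and answer ("1" or "2"), and a "_" placeholder for the blank. The example sentences are illustrative and not taken from the dataset, and toy_score is a hypothetical placeholder for a real model's scoring function.

```python
def fill_blank(sentence, option):
    """Substitute one candidate into the first "_" placeholder."""
    return sentence.replace("_", option, 1)


def predict(example, score_fn):
    """Return "1" or "2" depending on which filled-in sentence scores higher."""
    s1 = score_fn(fill_blank(example["sentence"], example["option1"]))
    s2 = score_fn(fill_blank(example["sentence"], example["option2"]))
    return "1" if s1 >= s2 else "2"


def accuracy(examples, score_fn):
    """Fraction of examples where the predicted option matches the answer."""
    hits = sum(predict(ex, score_fn) == ex["answer"] for ex in examples)
    return hits / len(examples)


# Illustrative twin sentences in the Winograd style (not taken from the dataset).
EXAMPLES = [
    {
        "sentence": "The trophy doesn't fit into the suitcase because the _ is too large.",
        "option1": "trophy",
        "option2": "suitcase",
        "answer": "1",
    },
    {
        "sentence": "The trophy doesn't fit into the suitcase because the _ is too small.",
        "option1": "trophy",
        "option2": "suitcase",
        "answer": "2",
    },
]


def toy_score(text):
    """Hypothetical scorer that prefers shorter text; a real run would use a model."""
    return -len(text)


print(accuracy(EXAMPLES, toy_score))  # prints 0.5
```

The toy scorer always picks the shorter option, so it gets exactly one of the two twin sentences right. Multiply the result by 100 to report it as Accuracy (%).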

Recent Benchmark Submissions

Task	Model	Paper	Date
parameter-efficient fine-tuning	LLaMA2-7b	GIFT-SW: Gaussian noise Injected Fine-Tuning …	2024-08-27
Common Sense Reasoning	LLaMA3 8B+MoSLoRA	Mixture-of-Subspaces in Low-Rank Adaptation	2024-06-16
Common Sense Reasoning	LLaMA-3 8B + MixLoRA	MixLoRA: Enhancing Large Language Models …	2024-04-22
Common Sense Reasoning	LLaMA-2 13B + MixLoRA	MixLoRA: Enhancing Large Language Models …	2024-04-22
Common Sense Reasoning	LLaMA-2 7B + MixLoRA	MixLoRA: Enhancing Large Language Models …	2024-04-22
Common Sense Reasoning	Branch-Train-MiX 4x7B (sampling top-1 expert)	Branch-Train-MiX: Mixing Expert LLMs into …	2024-03-12
parameter-efficient fine-tuning	LLaMA2-7b	DoRA: Weight-Decomposed Low-Rank Adaptation	2024-02-14
Common Sense Reasoning	Mixtral 8x7B (0-shot)	Mixtral of Experts	2024-01-08
Common Sense Reasoning	Mistral 7B (0-shot)	Mixtral of Experts	2024-01-08
Common Sense Reasoning	Camelidae-8×34B	Parameter-Efficient Sparsity Crafting from Dense …	2024-01-05
Common Sense Reasoning	Mistral 7B (0-shot)	Mistral 7B	2023-10-10
Common Sense Reasoning	phi-1.5-web 1.3B (zero-shot)	Textbooks Are All You Need …	2023-09-11
Common Sense Reasoning	T0-3B (CoT fine-tuned)	The CoT Collection: Improving Zero-shot …	2023-05-23
Common Sense Reasoning	PaLM 2-L (1-shot)	PaLM 2 Technical Report	2023-05-17
Common Sense Reasoning	PaLM 2-S (1-shot)	PaLM 2 Technical Report	2023-05-17
Common Sense Reasoning	PaLM 2-M (1-shot)	PaLM 2 Technical Report	2023-05-17
Common Sense Reasoning	LaMini-GPT 1.5B	LaMini-LM: A Diverse Herd of …	2023-04-27
Common Sense Reasoning	GPT-2-XL 1.5B	LaMini-LM: A Diverse Herd of …	2023-04-27
Common Sense Reasoning	LaMini-F-T5 783M	LaMini-LM: A Diverse Herd of …	2023-04-27
Common Sense Reasoning	T5-Large 738M	LaMini-LM: A Diverse Herd of …	2023-04-27

Research Papers

Recent papers with results on this dataset:

GIFT-SW: Gaussian noise Injected Fine-Tuning of Salient Weights for LLMs (2024)
Mixture-of-Subspaces in Low-Rank Adaptation (2024)
MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts (2024)
Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM (2024)
DoRA: Weight-Decomposed Low-Rank Adaptation (2024)
Mixtral of Experts (2024)
Parameter-Efficient Sparsity Crafting from Dense to Mixture-of-Experts for Instruction Tuning on General Tasks (2024)
Mistral 7B (2023)
Textbooks Are All You Need II: phi-1.5 technical report (2023)
The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning (2023)
PaLM 2 Technical Report (2023)
LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions (2023)
Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling (2023)
BloombergGPT: A Large Language Model for Finance (2023)
GPT-4 Technical Report (2023)
LLaMA: Open and Efficient Foundation Language Models (2023)
Exploring the Benefits of Training Expert Language Models over Instruction Tuning (2023)
Knowledge-in-Context: Towards Knowledgeable Semi-Parametric Language Models (2022)
Task Compass: Scaling Multi-task Pre-training with Task Prefix (2022)
Guess the Instruction! Flipped Learning Makes Language Models Stronger Zero-Shot Learners (2022)
