SWAG

Situations With Adversarial Generations

Dataset Information

Modalities: Texts
Languages: English
Introduced: 2018
License: MIT


Contents

Overview
Associated Benchmarks
Recent Benchmark Submissions
Research Papers

Overview

Given a partial description like "she opened the hood of the car," humans can reason about the situation and anticipate what might come next ("then, she examined the engine"). SWAG (Situations With Adversarial Generations) is a large-scale dataset for this task of grounded commonsense inference, unifying natural language inference and physically grounded reasoning.

The dataset consists of 113k multiple-choice questions about grounded situations. Each question is a video caption from LSMDC or ActivityNet Captions, paired with four answer choices describing what might happen next in the scene. The correct answer is the real video caption for the next event in the video; the three incorrect answers are adversarially generated and human-verified, so that they fool machines but not humans. The authors intend SWAG to serve as a benchmark for evaluating grounded commonsense NLI and for learning representations.

Source: SWAG
Image Source: Zellers et al.

Variants: SWAG
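
The question format described above (one context, four candidate endings, one correct ending) can be sketched as a small data structure. This is only an illustrative sketch: the class name `SwagItem` and its field names are hypothetical and are not taken from the official release, and the three distractor endings are placeholders written for this example, not rows from the dataset.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SwagItem:
    """One multiple-choice question (hypothetical layout, not the official schema)."""

    context: str        # partial description of the situation
    endings: List[str]  # exactly four candidate next events
    label: int          # index (0-3) of the real next caption

    def __post_init__(self) -> None:
        # Enforce the four-way multiple-choice shape described in the overview.
        if len(self.endings) != 4:
            raise ValueError("expected exactly four endings")
        if not 0 <= self.label < 4:
            raise ValueError("label must index one of the four endings")

    def gold_ending(self) -> str:
        """Return the ending marked as correct."""
        return self.endings[self.label]


# Context and correct ending come from the overview's example;
# the distractors are placeholders invented for illustration.
item = SwagItem(
    context="She opened the hood of the car.",
    endings=[
        "Then, she examined the engine.",
        "[placeholder distractor 1]",
        "[placeholder distractor 2]",
        "[placeholder distractor 3]",
    ],
    label=0,
)

print(item.gold_ending())
```

A model evaluated on such items would score each of the four endings given the context and predict the index with the highest score.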

Associated Benchmarks

This dataset is used in 2 benchmarks:

Question Answering - Metrics: Accuracy
Common Sense Reasoning - Metrics: Test, Dev
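
Accuracy on a four-way multiple-choice task is the fraction of questions for which the predicted ending index matches the correct one. The sketch below assumes that the "Test" and "Dev" metrics listed above are this same accuracy computed on the test and development splits; the page does not say so explicitly. The function name `accuracy` and the toy predictions are made up for illustration.

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of questions where the predicted ending index equals the label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


# With four answer choices, uniform random guessing is expected to score 1/4.
RANDOM_BASELINE = 1 / 4

# Toy data (not real model output): three of four predictions are correct.
toy_predictions = [0, 2, 1, 3]
toy_labels = [0, 2, 3, 3]
print(accuracy(toy_predictions, toy_labels))  # 0.75
```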

Recent Benchmark Submissions

Task	Model	Paper	Date
Question Answering	DeBERTaV3-large	DeBERTaV3: Improving DeBERTa using ELECTRA-Style …	2021-11-18
Common Sense Reasoning	DeBERTa-large	DeBERTa: Decoding-enhanced BERT with Disentangled …	2020-06-05
Common Sense Reasoning	RoBERTa	RoBERTa: A Robustly Optimized BERT …	2019-07-26
Common Sense Reasoning	BERT-LARGE	BERT: Pre-training of Deep Bidirectional …	2018-10-11
Common Sense Reasoning	ESIM + ELMo	SWAG: A Large-Scale Adversarial Dataset …	2018-08-16
Common Sense Reasoning	ESIM + GloVe	SWAG: A Large-Scale Adversarial Dataset …	2018-08-16

Research Papers

Recent papers with results on this dataset:

External Links: SWAG
