SIQA

Name: SIQA
Published: 2019-04-22
License: Unknown

Social Interaction QA

Dataset Information

Modalities: Texts
Introduced: 2019
License: Unknown
Homepage: Official Website

Contents

Overview
Associated Benchmarks
Recent Benchmark Submissions
Research Papers

Overview

Social Interaction QA (SIQA, also written Social IQa) is a question-answering benchmark for testing social commonsense intelligence. Unlike many prior benchmarks that focus on physical or taxonomic knowledge, Social IQa focuses on reasoning about people’s actions and their social implications. For example, given an action like "Jesse saw a concert" and a question like "Why did Jesse do this?", humans can easily infer that Jesse wanted "to see their favorite performer" or "to enjoy the music", and not "to see what's happening inside" or "to see if it works". The actions in Social IQa span a wide variety of social situations, and the answer candidates include both human-curated answers and adversarially filtered machine-generated ones. Social IQa contains over 37,000 QA pairs for evaluating models’ ability to reason about the social implications of everyday events and situations.
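To make the question format concrete, here is a minimal sketch of how one item could be represented in Python. The class name, field names, and the grouping of these particular candidates into a single item are assumptions made for illustration; they are not taken from the official data release.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SocialQAItem:
    """One multiple-choice item (hypothetical schema, for illustration only)."""

    context: str                 # the described action or situation
    question: str                # question about motivation, reaction, etc.
    candidates: Tuple[str, ...]  # answer candidates shown to the model
    label: int                   # index of the correct candidate

    def correct_answer(self) -> str:
        """Return the text of the candidate marked as correct."""
        return self.candidates[self.label]


# Built from the example in the overview; combining these candidates
# into one item is an assumption of this sketch.
item = SocialQAItem(
    context="Jesse saw a concert",
    question="Why did Jesse do this?",
    candidates=(
        "to see their favorite performer",
        "to see what's happening inside",
        "to see if it works",
    ),
    label=0,
)

print(item.correct_answer())
```

Storing the label as an index rather than as answer text keeps scoring independent of string formatting.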

Source: Social IQA
Image Source: https://arxiv.org/pdf/1904.09728.pdf

Variants: SIQA

Associated Benchmarks

This dataset is used in 1 benchmark:

Question Answering - Metrics: Accuracy
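Accuracy here is the fraction of questions for which a model selects the correct answer candidate. The following is a minimal sketch, assuming predictions and gold labels are both given as candidate indices; the function name `accuracy` is hypothetical and this is not the benchmark's official scoring script.

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of items where the predicted index equals the gold index."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("cannot compute accuracy on an empty set")
    # Count exact matches between predicted and gold candidate indices.
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


# Toy example with four questions, three of them answered correctly.
score = accuracy([0, 1, 2, 1], [0, 1, 1, 1])
print(f"{score:.2%}")
```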

Recent Benchmark Submissions

Task	Model	Paper	Date
Question Answering	LLaMA-3 8B+MoSLoRA (fine-tuned)	Mixture-of-Subspaces in Low-Rank Adaptation	2024-06-16
Question Answering	LLaMA-2 13B + MixLoRA	MixLoRA: Enhancing Large Language Models …	2024-04-22
Question Answering	LLaMA-2 7B + MixLoRA	MixLoRA: Enhancing Large Language Models …	2024-04-22
Question Answering	LLaMA-3 8B + MixLoRA	MixLoRA: Enhancing Large Language Models …	2024-04-22
Question Answering	phi-1.5 1.3B (zero-shot)	Textbooks Are All You Need …	2023-09-11
Question Answering	phi-1.5-web 1.3B (zero-shot)	Textbooks Are All You Need …	2023-09-11
Question Answering	LLaMA 13B (zero-shot)	LLaMA: Open and Efficient Foundation …	2023-02-27
Question Answering	LLaMA 33B (zero-shot)	LLaMA: Open and Efficient Foundation …	2023-02-27
Question Answering	LLaMA 7B (zero-shot)	LLaMA: Open and Efficient Foundation …	2023-02-27
Question Answering	LLaMA 65B (zero-shot)	LLaMA: Open and Efficient Foundation …	2023-02-27
Question Answering	DeBERTa-Large 304M	Two is Better than Many? …	2022-10-29
Question Answering	DeBERTa-Large 304M (classification-based)	Two is Better than Many? …	2022-10-29
Question Answering	CompassMTL 567M	Task Compass: Scaling Multi-task Pre-training …	2022-10-12
Question Answering	ExDeBERTa 567M	Task Compass: Scaling Multi-task Pre-training …	2022-10-12
Question Answering	CompassMTL 567M with Tailor	Task Compass: Scaling Multi-task Pre-training …	2022-10-12
Question Answering	Chinchilla (zero-shot)	Training Compute-Optimal Large Language Models	2022-03-29
Question Answering	Gopher (zero-shot)	Scaling Language Models: Methods, Analysis …	2021-12-08
Question Answering	Unicorn 11B (fine-tuned)	UNICORN on RAINBOW: A Universal …	2021-03-24
Question Answering	UnifiedQA 3B	UnifiedQA: Crossing Format Boundaries With …	2020-05-02
Question Answering	RoBERTa-Large 355M (fine-tuned)	RoBERTa: A Robustly Optimized BERT …	2019-07-26

Research Papers

Recent papers with results on this dataset:

External Links:

SIQA
