
SWAG

Common Sense Reasoning Benchmark

Performance Over Time

5 results; metric: test accuracy (%).

Top Performing Models

| Rank | Model | Paper | Test accuracy (%) | Date | Code |
|---|---|---|---|---|---|
| 1 | DeBERTa-large | DeBERTa: Decoding-enhanced BERT with Disentangled Attention | 90.80 | 2020-06-05 | huggingface/transformers, microsoft/DeBERTa, osu-nlp-group/mind2web |
| 2 | RoBERTa | RoBERTa: A Robustly Optimized BERT Pretraining Approach | 89.90 | 2019-07-26 | huggingface/transformers, pytorch/fairseq, PaddlePaddle/PaddleNLP |
| 3 | BERT-LARGE | BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | 86.30 | 2018-10-11 | huggingface/transformers, tensorflow/models, labmlai/annotated_deep_learning_paper_implementations |
| 4 | ESIM + ELMo | SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference | 59.20 | 2018-08-16 | - |
| 5 | ESIM + GloVe | SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference | 52.70 | 2018-08-16 | - |
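All of the entries above reduce to the same task: given a context sentence, pick the most plausible of four candidate endings, scored by accuracy. Below is a minimal sketch of how such a multiple-choice model can be evaluated on SWAG with the Hugging Face libraries listed in the Code column. The model and dataset identifiers are illustrative assumptions, not the leaderboard's official evaluation; a checkpoint fine-tuned on SWAG is needed to approach the numbers in the table, and the sketch scores the public validation split because test labels are withheld.

```python
# Sketch: scoring a BERT-style multiple-choice model on SWAG with
# Hugging Face transformers + datasets (identifiers are assumptions).
import torch
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForMultipleChoice

# Stand-in checkpoint; replace with a model fine-tuned on SWAG to get
# accuracies comparable to the table (an untuned head scores ~25%).
model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMultipleChoice.from_pretrained(model_name).eval()

# Public hub copy of SWAG; validation split carries gold labels.
swag = load_dataset("swag", "regular", split="validation[:100]")

correct = 0
for ex in swag:
    context = ex["sent1"]
    # Each candidate is the start of the second sentence plus one of four endings.
    choices = [ex["sent2"] + " " + ex[f"ending{i}"] for i in range(4)]
    enc = tokenizer([context] * 4, choices, return_tensors="pt",
                    padding=True, truncation=True)
    # The multiple-choice head expects shape (batch, num_choices, seq_len).
    enc = {k: v.unsqueeze(0) for k, v in enc.items()}
    with torch.no_grad():
        logits = model(**enc).logits  # shape (1, 4)
    correct += int(logits.argmax(dim=-1).item() == ex["label"])

print(f"accuracy: {correct / len(swag):.3f}")
```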

All Papers (5)