| Unicorn 11B (fine-tuned) | UNICORN on RAINBOW: A Universal Commonsense Reaso… | 83.20 | 2021-03-24 |
| LLaMA-2 13B + MixLoRA | MixLoRA: Enhancing Large Language Models Fine-Tun… | 82.50 | 2024-04-22 |
| CompassMTL 567M with Tailor | Task Compass: Scaling Multi-task Pre-training wit… | 82.20 | 2022-10-12 |
| CompassMTL 567M | Task Compass: Scaling Multi-task Pre-training wit… | 81.70 | 2022-10-12 |
| LLaMA-3 8B+MoSLoRA (fine-tuned) | Mixture-of-Subspaces in Low-Rank Adaptation | 81.00 | 2024-06-16 |
| DeBERTa-Large 304M | Two is Better than Many? Binary Classification as… | 80.20 | 2022-10-29 |
| DeBERTa-Large 304M (classification-based) | Two is Better than Many? Binary Classification as… | 79.90 | 2022-10-29 |
| UnifiedQA 3B | UnifiedQA: Crossing Format Boundaries With a Sing… | 79.80 | 2020-05-02 |
| ExDeBERTa 567M | Task Compass: Scaling Multi-task Pre-training wit… | 79.60 | 2022-10-12 |
| LLaMA-3 8B + MixLoRA | MixLoRA: Enhancing Large Language Models Fine-Tun… | 78.80 | 2024-04-22 |
| LLaMA-2 7B + MixLoRA | MixLoRA: Enhancing Large Language Models Fine-Tun… | 78.00 | 2024-04-22 |
| RoBERTa-Large 355M (fine-tuned) | RoBERTa: A Robustly Optimized BERT Pretraining Ap… | 76.70 | 2019-07-26 |
| BERT-large 340M (fine-tuned) | SocialIQA: Commonsense Reasoning about Social Int… | 64.50 | 2019-04-22 |
| BERT-base 110M (fine-tuned) | SocialIQA: Commonsense Reasoning about Social Int… | 63.10 | 2019-04-22 |
| GPT-1 117M (fine-tuned) | SocialIQA: Commonsense Reasoning about Social Int… | 63.00 | 2019-04-22 |
| phi-1.5-web 1.3B (zero-shot) | Textbooks Are All You Need II: phi-1.5 technical … | 53.00 | 2023-09-11 |
| phi-1.5 1.3B (zero-shot) | Textbooks Are All You Need II: phi-1.5 technical … | 52.60 | 2023-09-11 |
| LLaMA 65B (zero-shot) | LLaMA: Open and Efficient Foundation Language Mod… | 52.30 | 2023-02-27 |
| Chinchilla (zero-shot) | Training Compute-Optimal Large Language Models | 51.30 | 2022-03-29 |
| Gopher (zero-shot) | Scaling Language Models: Methods, Analysis & Insi… | 50.60 | 2021-12-08 |
| LLaMA 13B (zero-shot) | LLaMA: Open and Efficient Foundation Language Mod… | 50.40 | 2023-02-27 |
| LLaMA 33B (zero-shot) | LLaMA: Open and Efficient Foundation Language Mod… | 50.40 | 2023-02-27 |
| LLaMA 7B (zero-shot) | LLaMA: Open and Efficient Foundation Language Mod… | 48.90 | 2023-02-27 |
| Random chance baseline | SocialIQA: Commonsense Reasoning about Social Int… | 33.30 | 2019-04-22 |