| Model | Paper | Score | Date |
|---|---|---|---|
| BLOOMZ | Crosslingual Generalization through Multitask Fin… | 96.30 | 2022-11-03 |
| Flipped-3B | Guess the Instruction! Flipped Learning Makes Lan… | 95.88 | 2022-10-06 |
| FLAN 137B (few-shot, k=10) | Finetuned Language Models Are Zero-Shot Learners | 94.70 | 2021-09-03 |
| T0-3B (CoT fine-tuned) | The CoT Collection: Improving Zero-shot and Few-s… | 94.50 | 2023-05-23 |
| KiC-770M | Knowledge-in-Context: Towards Knowledgeable Semi-… | 94.40 | 2022-10-28 |
| FLAN 137B (zero-shot) | Finetuned Language Models Are Zero-Shot Learners | 93.40 | 2021-09-03 |
| Reading Strategies Model | Improving Machine Reading Comprehension with Gene… | 88.30 | 2018-10-31 |
| RoE-3B | Exploring the Benefits of Training Expert Languag… | 86.33 | 2023-02-07 |
| OPT-175B | SparseGPT: Massive Language Models Can Be Accurat… | 79.82 | 2023-01-02 |
| SparseGPT (175B, 50% Sparsity) | SparseGPT: Massive Language Models Can Be Accurat… | 78.87 | 2023-01-02 |
| SparseGPT (175B, 4:8 Sparsity) | SparseGPT: Massive Language Models Can Be Accurat… | 77.02 | 2023-01-02 |
| val-LS-skip | A Simple and Effective Approach to the Story Cloz… | 76.50 | 2018-03-15 |
| SparseGPT (175B, 2:4 Sparsity) | SparseGPT: Massive Language Models Can Be Accurat… | 76.19 | 2023-01-02 |
| sMLP – deterministic 9.4B (0-shot) | Efficient Language Modeling with Sparse all-MLP | 74.70 | 2022-03-14 |
| Switch Transformer 9B | Efficient Language Modeling with Sparse all-MLP | 73.30 | 2022-03-14 |
| GPT-3 Large 760M (zero-shot) | Language Models are Few-Shot Learners | 72.40 | 2020-05-28 |
| Gshard 9B | Efficient Language Modeling with Sparse all-MLP | 67.90 | 2022-03-14 |
| HASH Layers 10B (0-shot) | Efficient Language Modeling with Sparse all-MLP | 64.70 | 2022-03-14 |
| Base Layers 10B (0-shot) | Efficient Language Modeling with Sparse all-MLP | 61.40 | 2022-03-14 |
| OPT-175B (50% Sparsity) | SparseGPT: Massive Language Models Can Be Accurat… | 47.10 | 2023-01-02 |