| Model | Paper | Score | Date |
| --- | --- | --- | --- |
| PSQ (Chen et al., 2020) | A Statistical Framework for Low-bitwidth Training… | 86.80 | 2020-10-27 |
| Q8BERT (Zafrir et al., 2019) | Q8BERT: Quantized 8Bit BERT | 84.80 | 2019-10-14 |
| Q-BERT (Shen et al., 2020) | Q-BERT: Hessian Based Ultra Low Precision Quantiz… | 84.70 | 2019-09-12 |
| KiC-770M | Knowledge-in-Context: Towards Knowledgeable Semi-… | 74.00 | 2022-10-28 |
| Flipped-3B | Guess the Instruction! Flipped Learning Makes Lan… | 71.05 | 2022-10-06 |
| RoE-3B | Exploring the Benefits of Training Expert Languag… | 64.01 | 2023-02-07 |
| ELC-BERT-base 98M (zero init) | Not all layers are equally as important: Every La… | 63.00 | 2023-11-03 |
| ELC-BERT-small 24M | Not all layers are equally as important: Every La… | 55.40 | 2023-11-03 |
| LTG-BERT-base 98M | Not all layers are equally as important: Every La… | 54.70 | 2023-11-03 |
| LTG-BERT-small 24M | Not all layers are equally as important: Every La… | 53.70 | 2023-11-03 |
| PaLM 2-S (1-shot) | PaLM 2 Technical Report | | 2023-05-17 |
| Vega v2 6B (KD-based prompt transfer) | Toward Efficient Language Model Pretraining and D… | | 2022-12-04 |
| PaLM 540B (fine-tuned) | PaLM: Scaling Language Modeling with Pathways | | 2022-04-05 |
| Turing NLR v5 XXL 5.4B (fine-tuned) | Toward Efficient Language Model Pretraining and D… | | 2022-12-04 |
| ST-MoE-32B 269B (fine-tuned) | ST-MoE: Designing Stable and Transferable Sparse … | | 2022-02-17 |
| DeBERTa-1.5B | DeBERTa: Decoding-enhanced BERT with Disentangled… | | 2020-06-05 |
| MUPPET RoBERTa Large | Muppet: Massive Multi-task Representations with P… | | 2021-01-26 |
| DeBERTaV3-large | DeBERTaV3: Improving DeBERTa using ELECTRA-Style … | | 2021-11-18 |
| T5-XXL 11B | SMART: Robust and Efficient Fine-Tuning for Pre-t… | | 2019-11-08 |
| T5-XXL 11B (fine-tuned) | Exploring the Limits of Transfer Learning with a … | | 2019-10-23 |
| ST-MoE-L 4.1B (fine-tuned) | ST-MoE: Designing Stable and Transferable Sparse … | | 2022-02-17 |
| UL2 20B (fine-tuned) | UL2: Unifying Language Learning Paradigms | | 2022-05-10 |
| SMART-RoBERTa | SMART: Robust and Efficient Fine-Tuning for Pre-t… | | 2019-11-08 |
| FLAN 137B (prompt-tuned) | Finetuned Language Models Are Zero-Shot Learners | | 2021-09-03 |
| T5-XL 3B | Exploring the Limits of Transfer Learning with a … | | 2019-10-23 |
| RoBERTa-large 355M + Entailment as Few-shot Learner | Entailment as Few-Shot Learner | | 2021-04-29 |
| ALBERT | ALBERT: A Lite BERT for Self-supervised Learning … | | 2019-09-26 |
| Adv-RoBERTa ensemble | StructBERT: Incorporating Language Structures int… | | 2019-08-13 |
| RoBERTa | RoBERTa: A Robustly Optimized BERT Pretraining Ap… | | 2019-07-26 |
| RoBERTa (ensemble) | RoBERTa: A Robustly Optimized BERT Pretraining Ap… | | 2019-07-26 |
| T5-Large 738M | LaMini-LM: A Diverse Herd of Distilled Models fro… | | 2023-04-27 |
| T5-Large 770M | Exploring the Limits of Transfer Learning with a … | | 2019-10-23 |
| RoBERTa-large 355M + EFL + UCA | Entailment as Few-Shot Learner | | 2021-04-29 |
| PaLM 540B (1-shot) | PaLM: Scaling Language Modeling with Pathways | | 2022-04-05 |
| XLNet (single model) | XLNet: Generalized Autoregressive Pretraining for… | | 2019-06-19 |
| RoBERTa-large 355M (MLP quantized vector-wise, fine-tuned) | LLM.int8(): 8-bit Matrix Multiplication for Trans… | | 2022-08-15 |
| OPT-IML 175B | OPT-IML: Scaling Language Model Instruction Meta … | | 2022-12-22 |
| FLAN 137B (8-shot) | Finetuned Language Models Are Zero-Shot Learners | | 2021-09-03 |
| FLAN 137B (0-shot) | Finetuned Language Models Are Zero-Shot Learners | | 2021-09-03 |
| OPT-IML 30B | OPT-IML: Scaling Language Model Instruction Meta … | | 2022-12-22 |
| PaLM 2-M (1-shot) | PaLM 2 Technical Report | | 2023-05-17 |
| T0-3B (CoT fine-tuned) | The CoT Collection: Improving Zero-shot and Few-s… | | 2023-05-23 |
| ERNIE 2.0 Large | ERNIE 2.0: A Continual Pre-training Framework for… | | 2019-07-29 |
| T5-Base 220M | Exploring the Limits of Transfer Learning with a … | | 2019-10-23 |
| MLM+ del-span | CLEAR: Contrastive Learning for Sentence Represen… | | 2020-12-31 |
| PaLM 540B (5-shot) | PaLM: Scaling Language Modeling with Pathways | | 2022-04-05 |
| PaLM 2-L (1-shot) | PaLM 2 Technical Report | | 2023-05-17 |
| SpanBERT | SpanBERT: Improving Pre-training by Representing … | | 2019-07-24 |
| Neo-6B (QA + WS) | Ask Me Anything: A simple strategy for prompting … | | 2022-10-05 |
| BigBird | Big Bird: Transformers for Longer Sequences | | 2020-07-28 |
| ERNIE 2.0 Base | ERNIE 2.0: A Continual Pre-training Framework for… | | 2019-07-29 |
| RealFormer | RealFormer: Transformer Likes Residual Attention | | 2020-12-21 |
| SqueezeBERT | SqueezeBERT: What can computer vision teach NLP a… | | 2020-06-19 |
| PaLM 540B (0-shot) | PaLM: Scaling Language Modeling with Pathways | | 2022-04-05 |
| SMART-BERT | SMART: Robust and Efficient Fine-Tuning for Pre-t… | | 2019-11-08 |
| SMART | SMART: Robust and Efficient Fine-Tuning for Pre-t… | | 2019-11-08 |
| BERT-large 340M | BERT: Pre-training of Deep Bidirectional Transfor… | | 2018-10-11 |
| T5-Small | Exploring the Limits of Transfer Learning with a … | | 2019-10-23 |
| TinyBERT-4 14.5M | TinyBERT: Distilling BERT for Natural Language Un… | | 2019-09-23 |
| data2vec | data2vec: A General Framework for Self-supervised… | | 2022-02-07 |
| BloombergGPT 50B (1-shot) | BloombergGPT: A Large Language Model for Finance | | 2023-03-30 |
| FNet-Large | FNet: Mixing Tokens with Fourier Transforms | | 2021-05-09 |
| GPT-3 175B (few-shot, k=32) | Language Models are Few-Shot Learners | | 2020-05-28 |
| ERNIE | ERNIE: Enhanced Language Representation with Info… | | 2019-05-17 |
| AlexaTM 20B | AlexaTM 20B: Few-Shot Learning Using a Large-Scal… | | 2022-08-02 |
| LaMini-GPT 1.5B | LaMini-LM: A Diverse Herd of Distilled Models fro… | | 2023-04-27 |
| SenseBERT-base 110M | SenseBERT: Driving Some Sense into BERT | | 2019-08-15 |
| OPT-IML 1.3B | OPT-IML: Scaling Language Model Instruction Meta … | | 2022-12-22 |
| TinyBERT-6 67M | TinyBERT: Distilling BERT for Natural Language Un… | | 2019-09-23 |
| LaMini-F-T5 783M | LaMini-LM: A Diverse Herd of Distilled Models fro… | | 2023-04-27 |
| DistilBERT 66M | DistilBERT, a distilled version of BERT: smaller,… | | 2019-10-02 |
| Neo-6B (QA) | Ask Me Anything: A simple strategy for prompting … | | 2022-10-05 |
| UL2 20B (0-shot) | UL2: Unifying Language Learning Paradigms | | 2022-05-10 |
| OPT 175B | OPT-IML: Scaling Language Model Instruction Meta … | | 2022-12-22 |
| N-Grammer 343M | N-Grammer: Augmenting Transformers with latent n-… | | 2022-07-13 |
| Hybrid H3 125M (0-shot, logit scoring) | Hungry Hungry Hippos: Towards Language Modeling w… | | 2022-12-28 |
| Neo-6B (few-shot) | Ask Me Anything: A simple strategy for prompting … | | 2022-10-05 |
| OPT 30B | OPT-IML: Scaling Language Model Instruction Meta … | | 2022-12-22 |
| Hybrid H3 125M (3-shot, logit scoring) | Hungry Hungry Hippos: Towards Language Modeling w… | | 2022-12-28 |
| Hybrid H3 125M (3-shot, rank classification) | Hungry Hungry Hippos: Towards Language Modeling w… | | 2022-12-28 |
| 24hBERT | How to Train BERT with an Academic Budget | | 2021-04-15 |
| BLOOM 176B (1-shot) | BloombergGPT: A Large Language Model for Finance | | 2023-03-30 |
| LaMini-T5 738M | LaMini-LM: A Diverse Herd of Distilled Models fro… | | 2023-04-27 |
| OPT 66B (1-shot) | BloombergGPT: A Large Language Model for Finance | | 2023-03-30 |
| OPT 1.3B | OPT-IML: Scaling Language Model Instruction Meta … | | 2022-12-22 |
| GPT-NeoX 20B (1-shot) | BloombergGPT: A Large Language Model for Finance | | 2023-03-30 |
| H3 125M (0-shot, rank classification) | Hungry Hungry Hippos: Towards Language Modeling w… | | 2022-12-28 |
| GPT-2-XL 1.5B | LaMini-LM: A Diverse Herd of Distilled Models fro… | | 2023-04-27 |
| H3 125M (3-shot, rank classification) | Hungry Hungry Hippos: Towards Language Modeling w… | | 2022-12-28 |