
CommitmentBank

Natural Language Inference Benchmark

Performance Over Time

(Chart: 20 results plotted over time; metric: Accuracy.)
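The leaderboard metric is plain accuracy over CommitmentBank's three-way labels (entailment, contradiction, neutral). The snippet below is a minimal sketch of how such a score can be computed, assuming the Hugging Face `datasets` package, which exposes CommitmentBank as the `cb` configuration of SuperGLUE; the majority-class baseline is a placeholder predictor for illustration only, not one of the systems listed on this page.

```python
# Minimal sketch: load CommitmentBank and score predictions with accuracy,
# the metric reported on this leaderboard. Assumes `pip install datasets`.
from collections import Counter

from datasets import load_dataset

# CommitmentBank ("cb" config of SuperGLUE): premise/hypothesis pairs labeled
# 0 = entailment, 1 = contradiction, 2 = neutral.
cb = load_dataset("super_glue", "cb")
train, val = cb["train"], cb["validation"]

# Placeholder predictor: always guess the most frequent training label.
majority_label = Counter(train["label"]).most_common(1)[0][0]
predictions = [majority_label] * len(val)

# Accuracy = fraction of exact label matches, reported here as a percentage.
correct = sum(p == y for p, y in zip(predictions, val["label"]))
print(f"Validation accuracy: {100.0 * correct / len(val):.2f}%")
```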

Top Performing Models

| Rank | Model | Paper | Accuracy (%) | Date | Code |
|------|-------|-------|--------------|------|------|
| 1 | PaLM 540B (finetuned) | PaLM: Scaling Language Modeling with Pathways | 100.00 | 2022-04-05 | 📦 lucidrains/CoCa-pytorch, 📦 lucidrains/PaLM-pytorch, 📦 google/paxml |
| 2 | Vega v2 6B (KD-based prompt transfer) | Toward Efficient Language Model Pretraining and Downstream Adaptation via Self-Evolution: A Case Study on SuperGLUE | 99.20 | 2022-12-04 | - |
| 3 | ST-MoE-L 4.1B (fine-tuned) | ST-MoE: Designing Stable and Transferable Sparse Expert Models | 98.20 | 2022-02-17 | 📦 tensorflow/mesh, 📦 xuefuzhao/openmoe, 📦 yikangshen/megablocks |
| 4 | ST-MoE-32B 269B (fine-tuned) | ST-MoE: Designing Stable and Transferable Sparse Expert Models | 98.00 | 2022-02-17 | 📦 tensorflow/mesh, 📦 xuefuzhao/openmoe, 📦 yikangshen/megablocks |
| 5 | Turing NLR v5 XXL 5.4B (fine-tuned) | Toward Efficient Language Model Pretraining and Downstream Adaptation via Self-Evolution: A Case Study on SuperGLUE | 97.60 | 2022-12-04 | - |
| 6 | DeBERTa-1.5B | DeBERTa: Decoding-enhanced BERT with Disentangled Attention | 97.20 | 2020-06-05 | 📦 huggingface/transformers, 📦 microsoft/DeBERTa, 📦 osu-nlp-group/mind2web |
| 7 | T5-XXL 11B (fine-tuned) | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | 96.80 | 2019-10-23 | 📦 huggingface/transformers, 📦 PaddlePaddle/PaddleNLP, 📦 google-research/text-to-text-transfer-transformer |
| 8 | T5-Large 770M (fine-tuned) | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | 94.40 | 2019-10-23 | 📦 huggingface/transformers, 📦 PaddlePaddle/PaddleNLP, 📦 google-research/text-to-text-transfer-transformer |
| 9 | T5-Base 220M (fine-tuned) | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | 94.00 | 2019-10-23 | 📦 huggingface/transformers, 📦 PaddlePaddle/PaddleNLP, 📦 google-research/text-to-text-transfer-transformer |
| 10 | PaLM 2-L (one-shot) | PaLM 2 Technical Report | 87.50 | 2023-05-17 | 📦 eternityyw/tram-benchmark |

All Papers (20)

| Paper | Year | Model |
|-------|------|-------|
| Language Models are Few-Shot Learners | 2020 | GPT-3 175B (Few-Shot) |
| Language Models are Few-Shot Learners | 2020 | GPT-3 175B (few-shot, k=32) |