
COPA (Choice of Plausible Alternatives)

Question Answering Benchmark

COPA is a commonsense causal reasoning benchmark, included in the SuperGLUE suite: given a premise, a model must choose which of two alternatives is the more plausible cause or effect. Performance is reported as accuracy.

Performance Over Time

[Chart omitted: accuracy plotted over time; 55 results | Metric: Accuracy]

Top Performing Models

| Rank | Model | Paper | Accuracy (%) | Date | Code |
|------|-------|-------|--------------|------|------|
| 1 | PaLM 540B (finetuned) | PaLM: Scaling Language Modeling with Pathways | 100.00 | 2022-04-05 | lucidrains/CoCa-pytorch, lucidrains/PaLM-pytorch, google/paxml |
| 2 | Vega v2 6B (KD-based prompt transfer) | Toward Efficient Language Model Pretraining and Downstream Adaptation via Self-Evolution: A Case Study on SuperGLUE | 99.40 | 2022-12-04 | - |
| 3 | ST-MoE-32B 269B (fine-tuned) | ST-MoE: Designing Stable and Transferable Sparse Expert Models | 99.20 | 2022-02-17 | tensorflow/mesh, xuefuzhao/openmoe, yikangshen/megablocks |
| 4 | UL2 20B (fine-tuned) | UL2: Unifying Language Learning Paradigms | 99.00 | 2022-05-10 | google-research/google-research, opennlg/openba-v2 |
| 5 | DeBERTa-Ensemble | DeBERTa: Decoding-enhanced BERT with Disentangled Attention | 98.40 | 2020-06-05 | huggingface/transformers, microsoft/DeBERTa, osu-nlp-group/mind2web |
| 6 | Turing NLR v5 XXL 5.4B (fine-tuned) | Toward Efficient Language Model Pretraining and Downstream Adaptation via Self-Evolution: A Case Study on SuperGLUE | 98.20 | 2022-12-04 | - |
| 7 | DeBERTa-1.5B | DeBERTa: Decoding-enhanced BERT with Disentangled Attention | 96.80 | 2020-06-05 | huggingface/transformers, microsoft/DeBERTa, osu-nlp-group/mind2web |
| 8 | PaLM 2-L (1-shot) | PaLM 2 Technical Report | 96.00 | 2023-05-17 | eternityyw/tram-benchmark |
| 9 | T5-XXL 11B (fine-tuned) | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | 94.80 | 2019-10-23 | huggingface/transformers, PaddlePaddle/PaddleNLP, google-research/text-to-text-transfer-transformer |
| 10 | FLAN 137B (prompt-tuned) | Finetuned Language Models Are Zero-Shot Learners | 94.00 | 2021-09-03 | hiyouga/llama-efficient-tuning, bigcode-project/starcoder, bigscience-workshop/promptsource |
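The accuracy numbers above come from scoring each item's two alternatives and counting how often the model prefers the labeled one. As a minimal sketch (not any paper's actual evaluation code), the snippet below loads the SuperGLUE COPA validation split with the Hugging Face `datasets` library and scores a trivial constant baseline; field names follow the public `super_glue`/`copa` schema, and depending on your `datasets` version, loading may require `trust_remote_code=True`.

```python
# Minimal sketch: compute COPA accuracy for a toy baseline.
# Assumes the Hugging Face `datasets` package; recent versions may need
# trust_remote_code=True to run the super_glue loading script.
from datasets import load_dataset

copa = load_dataset("super_glue", "copa", split="validation")

def predict(example):
    # Placeholder: always pick choice1 (label 0). A real system would
    # score choice1 and choice2 with a language model and take the argmax.
    return 0

correct = sum(predict(ex) == ex["label"] for ex in copa)
print(f"accuracy = {100.0 * correct / len(copa):.2f}%")  # ~50%: chance level
```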

All Papers (55)

- Language Models are Few-Shot Learners (2020): GPT-3 175B (few-shot, k=32)
- Language Models are Few-Shot Learners (2020): GPT-3 13B (few-shot, k=32)
- Efficient Language Modeling with Sparse all-MLP (2022): sMLP – deterministic 9.4B (0-shot)
- Language Models are Few-Shot Learners (2020): GPT-3 Large 760M (0-shot)
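Several entries above are evaluated in-context rather than fine-tuned: "0-shot" means the model sees only the test item, while "few-shot, k=32" prepends 32 solved examples to the prompt. The sketch below shows one plausible way to render COPA items into such a prompt; the template and helper names are illustrative assumptions, not the exact format used in the GPT-3 paper.

```python
# Illustrative k-shot prompt construction for COPA. The template is an
# assumption for exposition; papers differ in their exact formats.

def format_item(ex, include_answer=True):
    # COPA asks for the more plausible cause ("because") or effect ("so").
    connective = "because" if ex["question"] == "cause" else "so"
    stem = f"{ex['premise'].rstrip('.')} {connective}"
    if include_answer:
        answer = ex["choice1"] if ex["label"] == 0 else ex["choice2"]
        return f"{stem} {answer}"
    return stem

def copa_prompt(demos, query):
    # k solved demonstrations followed by the unanswered query; the model's
    # continuation is then scored against choice1 and choice2.
    parts = [format_item(d) for d in demos]
    parts.append(format_item(query, include_answer=False))
    return "\n\n".join(parts)

# Hypothetical item in the super_glue/copa schema:
item = {"premise": "The man broke his toe.", "question": "cause",
        "choice1": "He got a hole in his sock.",
        "choice2": "He dropped a hammer on his foot.", "label": 1}
print(copa_prompt([], item))  # -> "The man broke his toe because"
```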