MultiRC

Question Answering Benchmark

Showing 30 results. Metric: F1

Top Performing Models

| Rank | Model | Paper | F1 | Date | Code |
|------|-------|-------|----|------|------|
| 1 | PaLM 540B (fine-tuned) | PaLM: Scaling Language Modeling with Pathways | 90.10 | 2022-04-05 | lucidrains/CoCa-pytorch, lucidrains/PaLM-pytorch, google/paxml |
| 2 | ST-MoE-32B 269B (fine-tuned) | ST-MoE: Designing Stable and Transferable Sparse Expert Models | 89.60 | 2022-02-17 | tensorflow/mesh, xuefuzhao/openmoe, yikangshen/megablocks |
| 3 | Turing NLR v5 XXL 5.4B (fine-tuned) | Toward Efficient Language Model Pretraining and Downstream Adaptation via Self-Evolution: A Case Study on SuperGLUE | 88.40 | 2022-12-04 | - |
| 4 | DeBERTa-1.5B | DeBERTa: Decoding-enhanced BERT with Disentangled Attention | 88.20 | 2020-06-05 | huggingface/transformers, microsoft/DeBERTa, osu-nlp-group/mind2web |
| 5 | Vega v2 6B (fine-tuned) | Toward Efficient Language Model Pretraining and Downstream Adaptation via Self-Evolution: A Case Study on SuperGLUE | 88.20 | 2022-12-04 | - |
| 6 | PaLM 2-L (one-shot) | PaLM 2 Technical Report | 88.20 | 2023-05-17 | eternityyw/tram-benchmark |
| 7 | T5-XXL 11B (fine-tuned) | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | 88.10 | 2019-10-23 | huggingface/transformers, PaddlePaddle/PaddleNLP, google-research/text-to-text-transfer-transformer |
| 8 | ST-MoE-L 4.1B (fine-tuned) | ST-MoE: Designing Stable and Transferable Sparse Expert Models | 86.00 | 2022-02-17 | tensorflow/mesh, xuefuzhao/openmoe, yikangshen/megablocks |
| 9 | PaLM 2-M (one-shot) | PaLM 2 Technical Report | 84.10 | 2023-05-17 | eternityyw/tram-benchmark |
| 10 | PaLM 2-S (one-shot) | PaLM 2 Technical Report | 84.00 | 2023-05-17 | eternityyw/tram-benchmark |

All Papers (30)

Language Models are Few-Shot Learners

2020
GPT-3 175B (Few-Shot)