ReCoRD

Common Sense Reasoning Benchmark

Performance Over Time

Showing 33 results | Metric: EM (exact match)
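For readers unfamiliar with the metric, here is a minimal sketch of how an exact-match (EM) score of the kind reported on this page can be computed. It assumes SQuAD-style answer normalization (lowercasing, stripping punctuation and English articles) and that a prediction counts as correct if it matches any of the gold answers for a query. The function names `normalize_answer`, `exact_match`, and `em_score` are illustrative, not taken from any official evaluation script, and the individual papers listed here may differ in details.

```python
import re
import string


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: SQuAD-style normalization; not confirmed by this page.
    """
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold_answers: list[str]) -> bool:
    """True if the normalized prediction equals any normalized gold answer."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(gold) for gold in gold_answers)


def em_score(predictions: list[str], references: list[list[str]]) -> float:
    """Percentage of predictions that exactly match a gold answer (0 to 100)."""
    if not predictions:
        return 0.0
    hits = sum(exact_match(p, golds) for p, golds in zip(predictions, references))
    return 100.0 * hits / len(predictions)


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    preds = ["The Beatles", "Paris"]
    golds = [["Beatles", "the beatles"], ["London"]]
    print(em_score(preds, golds))
```

Under these assumptions, a score such as 96.40 would mean that 96.40% of predicted answers matched a gold answer after normalization.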

Top Performing Models

| Rank | Model | Paper | EM | Date | Code |
|---|---|---|---|---|---|
| 1 | Turing NLR v5 XXL 5.4B (fine-tuned) | Toward Efficient Language Model Pretraining and Downstream Adaptation via Self-Evolution: A Case Study on SuperGLUE | 96.40 | 2022-12-04 | - |
| 2 | ST-MoE-32B 269B (fine-tuned) | ST-MoE: Designing Stable and Transferable Sparse Expert Models | 95.10 | 2022-02-17 | tensorflow/mesh, xuefuzhao/openmoe, yikangshen/megablocks |
| 3 | PaLM 540B (finetuned) | PaLM: Scaling Language Modeling with Pathways | 94.60 | 2022-04-05 | lucidrains/CoCa-pytorch, lucidrains/PaLM-pytorch, google/paxml |
| 4 | DeBERTa-1.5B | DeBERTa: Decoding-enhanced BERT with Disentangled Attention | 94.50 | 2020-06-05 | huggingface/transformers, microsoft/DeBERTa, osu-nlp-group/mind2web |
| 5 | Vega v2 6B (fine-tuned) | Toward Efficient Language Model Pretraining and Downstream Adaptation via Self-Evolution: A Case Study on SuperGLUE | 94.40 | 2022-12-04 | - |
| 6 | T5-11B | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | 94.10 | 2019-10-23 | huggingface/transformers, PaddlePaddle/PaddleNLP, google-research/text-to-text-transfer-transformer |
| 7 | PaLM 2-L (one-shot) | PaLM 2 Technical Report | 93.80 | 2023-05-17 | eternityyw/tram-benchmark |
| 8 | T5-XXL 11B (fine-tuned) | Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | 93.40 | 2019-10-23 | huggingface/transformers, PaddlePaddle/PaddleNLP, google-research/text-to-text-transfer-transformer |
| 9 | PaLM 2-M (one-shot) | PaLM 2 Technical Report | 92.40 | 2023-05-17 | eternityyw/tram-benchmark |
| 10 | GESA 500M | Integrating a Heterogeneous Graph with Entity-aware Self-attention using Relative Position Labels for Reading Comprehension Model | 92.20 | 2023-07-19 | - |

All Papers (33)

Language Models are Few-Shot Learners

2020
GPT-3 Large 760M (0-shot)

Efficient Language Modeling with Sparse all-MLP

2022
sMLP – deterministic 9.4B (0-shot)