PIQA

Question Answering Benchmark

PIQA (Physical Interaction: Question Answering) tests physical commonsense reasoning: each question pairs a goal with two candidate solutions, and a model must select the correct one. Results are reported as accuracy.

Performance Over Time

67 results reported | Metric: Accuracy

Top Performing Models

| Rank | Model | Paper | Accuracy (%) | Date | Code |
| --- | --- | --- | --- | --- | --- |
| 1 | Unicorn 11B (fine-tuned) | UNICORN on RAINBOW: A Universal Commonsense Reasoning Model on a New Multitask Benchmark | 90.10 | 2021-03-24 | allenai/rainbow |
| 2 | LLaMA3 8B + MoSLoRA | Mixture-of-Subspaces in Low-Rank Adaptation | 89.70 | 2024-06-16 | wutaiqiang/moslora |
| 3 | CompassMTL 567M with Tailor | Task Compass: Scaling Multi-task Pre-training with Task Prefix | 88.30 | 2022-10-12 | cooelf/compassmtl |
| 4 | LLaMA-3 8B + MixLoRA | MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts | 87.60 | 2024-04-22 | TUDB-Labs/MixLoRA, mikecovlee/mLoRA |
| 5 | DeBERTa-Large 304M | Two is Better than Many? Binary Classification as an Effective Approach to Multi-Choice Question Answering | 87.40 | 2022-10-29 | declare-lab/team |
| 6 | CompassMTL 567M | Task Compass: Scaling Multi-task Pre-training with Task Prefix | 87.30 | 2022-10-12 | cooelf/compassmtl |
| 7 | LLaMA-2 13B + MixLoRA | MixLoRA: Enhancing Large Language Models Fine-Tuning with LoRA-based Mixture of Experts | 86.80 | 2024-04-22 | TUDB-Labs/MixLoRA, mikecovlee/mLoRA |
| 8 | Shakti-LLM (2.5B) | SHAKTI: A 2.5 Billion Parameter Small Language Model Optimized for Edge AI and Low-Resource Environments | 86.20 | 2024-10-15 | - |
| 9 | DeBERTa-Large 304M (classification-based) | Two is Better than Many? Binary Classification as an Effective Approach to Multi-Choice Question Answering | 85.90 | 2022-10-29 | declare-lab/team |
| 10 | ExDeBERTa 567M | Task Compass: Scaling Multi-task Pre-training with Task Prefix | 85.50 | 2022-10-12 | cooelf/compassmtl |
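The Accuracy column is the share of PIQA's two-choice questions a model answers correctly. Below is a minimal sketch of that scoring loop, assuming the Hugging Face `datasets` copy of PIQA (`ybisk/piqa`); the `predict` function is a hypothetical stand-in for whatever model is being evaluated.

```python
# Minimal sketch: computing PIQA accuracy as reported on this leaderboard.
# Assumes the Hugging Face `datasets` copy of PIQA ("ybisk/piqa");
# `predict` is a hypothetical stand-in for a real model.
from datasets import load_dataset

def predict(goal: str, sol1: str, sol2: str) -> int:
    """Hypothetical model call: return 0 to choose sol1, 1 to choose sol2."""
    return 0 if len(sol1) <= len(sol2) else 1  # placeholder heuristic only

# Validation split has public labels; test labels are held out.
# Depending on the `datasets` version, trust_remote_code=True may be required.
piqa = load_dataset("ybisk/piqa", split="validation")

correct = sum(
    predict(ex["goal"], ex["sol1"], ex["sol2"]) == ex["label"] for ex in piqa
)
print(f"Accuracy: {100.0 * correct / len(piqa):.2f}")  # table values are percentages
```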

All Papers (67)

Efficient Language Modeling with Sparse all-MLP (2022): sMLP - deterministic 9.4B (0-shot)

Language Models are Few-Shot Learners (2020): GPT-3 Large 760M (0-shot)
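Entries marked "(0-shot)", such as the GPT-3 result above, evaluate the base language model without fine-tuning: the usual recipe is to score each candidate solution's likelihood under the model and pick the higher one. Below is a hedged sketch of that comparison; the `gpt2` checkpoint and mean-log-probability normalization are illustrative assumptions, not the exact setup of any listed paper.

```python
# Sketch of zero-shot PIQA scoring by likelihood comparison: the model
# never sees PIQA training data; it just rates which solution is more
# probable as a continuation of the goal.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")            # illustrative stand-in model
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def solution_logprob(goal: str, sol: str) -> float:
    """Mean log-probability of the solution tokens given the goal as context."""
    ctx_len = tok(goal, return_tensors="pt").input_ids.shape[1]
    full = tok(goal + " " + sol, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full).logits
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)  # position i predicts token i+1
    targets = full[0, 1:]
    token_lp = logprobs[torch.arange(targets.numel()), targets]
    # Keep only the solution span; assumes tokenization splits cleanly at ctx_len.
    return token_lp[ctx_len - 1:].mean().item()            # length-normalized

def predict(goal: str, sol1: str, sol2: str) -> int:
    """Pick the solution the model finds more likely (0 -> sol1, 1 -> sol2)."""
    return 0 if solution_logprob(goal, sol1) >= solution_logprob(goal, sol2) else 1
```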