DROP

Question Answering Benchmark

Performance Over Time (chart not shown; 6 results, metric: Accuracy)

Top Performing Models

| Rank | Model | Paper | Accuracy | Date | Code |
|------|-------|-------|----------|------|------|
| 1 | PaLM 540B (Self Improvement, Self Consistency) | Large Language Models Can Self-Improve | 83.00 | 2022-10-20 | - |
| 2 | PaLM 540B (Self Consistency) | Large Language Models Can Self-Improve | 78.20 | 2022-10-20 | - |
| 3 | PaLM 540B (Self Improvement, CoT Prompting) | Large Language Models Can Self-Improve | 76.20 | 2022-10-20 | - |
| 4 | PaLM 540B (Self Improvement, Standard-Prompting) | Large Language Models Can Self-Improve | 71.70 | 2022-10-20 | - |
| 5 | PaLM 540B (CoT Prompting) | Large Language Models Can Self-Improve | 70.60 | 2022-10-20 | - |
| 6 | PaLM 540B (Standard-Prompting) | Large Language Models Can Self-Improve | 60.00 | 2022-10-20 | - |
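
The six rows pair each prompting method with and without Self Improvement, so the absolute accuracy gain per method can be read off by subtraction. The following is a minimal sketch of that arithmetic, using only the accuracy values listed in the table; the names `scores`, `self_improvement_gains`, and `gains` are hypothetical helpers introduced for illustration and do not come from the paper or the wiki.

```python
# Accuracy values copied from the leaderboard table.
# Key: (prompting method, whether Self Improvement was applied).
scores = {
    ("Standard-Prompting", False): 60.0,
    ("Standard-Prompting", True): 71.7,
    ("CoT Prompting", False): 70.6,
    ("CoT Prompting", True): 76.2,
    ("Self Consistency", False): 78.2,
    ("Self Consistency", True): 83.0,
}


def self_improvement_gains(scores):
    """Return the absolute accuracy gain from Self Improvement per method."""
    methods = sorted({method for method, _ in scores})
    return {
        # Round to one decimal to hide floating-point noise in the subtraction.
        method: round(scores[(method, True)] - scores[(method, False)], 1)
        for method in methods
    }


gains = self_improvement_gains(scores)
for method, gain in gains.items():
    print(f"{method}: +{gain:.1f}")
```

These are differences in accuracy points between rows of this table only; they say nothing about variance across runs or about other models.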

All Papers (6)

Large Language Models Can Self-Improve (2022): PaLM 540B (Self Improvement, Self Consistency)
Large Language Models Can Self-Improve (2022): PaLM 540B (Self Consistency)
Large Language Models Can Self-Improve (2022): PaLM 540B (Self Improvement, CoT Prompting)
Large Language Models Can Self-Improve (2022): PaLM 540B (Self Improvement, Standard-Prompting)
Large Language Models Can Self-Improve (2022): PaLM 540B (CoT Prompting)
Large Language Models Can Self-Improve (2022): PaLM 540B (Standard-Prompting)