EconLogicQA

Sentence Ordering Benchmark

Showing 18 results | Metric: Accuracy

Top Performing Models

| Rank | Model | Paper | Accuracy | Date | Code |
|------|-------|-------|----------|------|------|
| 1 | GPT-4-Turbo | EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | 0.57 | 2024-05-13 | yinzhu-quan/lm-evaluation-harness |
| 2 | GPT-4 | EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | 0.55 | 2024-05-13 | yinzhu-quan/lm-evaluation-harness |
| 3 | GPT-3.5-Turbo | EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | 0.38 | 2024-05-13 | yinzhu-quan/lm-evaluation-harness |
| 4 | Llama-3-8B-Instruct | EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | 0.35 | 2024-05-13 | yinzhu-quan/lm-evaluation-harness |
| 5 | Mistral-7B-Instruct-v0.2 | EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | 0.32 | 2024-05-13 | yinzhu-quan/lm-evaluation-harness |
| 6 | Mistral-7B-v0.1 | EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | 0.26 | 2024-05-13 | yinzhu-quan/lm-evaluation-harness |
| 7 | Mistral-7B-v0.2 | EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | 0.26 | 2024-05-13 | yinzhu-quan/lm-evaluation-harness |
| 8 | Llama-3-8B | EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | 0.24 | 2024-05-13 | yinzhu-quan/lm-evaluation-harness |
| 9 | Zephyr-7B-Alpha | EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | 0.23 | 2024-05-13 | yinzhu-quan/lm-evaluation-harness |
| 10 | Yi-6B-Chat | EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | 0.21 | 2024-05-13 | yinzhu-quan/lm-evaluation-harness |

All Papers (18)