| GPT-4-Turbo | EconLogicQA: A Question-Answering Benchmark for E… | 0.57 | 2024-05-13 |
| GPT-4 | EconLogicQA: A Question-Answering Benchmark for E… | 0.55 | 2024-05-13 |
| GPT-3.5-Turbo | EconLogicQA: A Question-Answering Benchmark for E… | 0.38 | 2024-05-13 |
| Llama-3-8B-Instruct | EconLogicQA: A Question-Answering Benchmark for E… | 0.35 | 2024-05-13 |
| Mistral-7B-Instruct-v0.2 | EconLogicQA: A Question-Answering Benchmark for E… | 0.32 | 2024-05-13 |
| Mistral-7B-v0.1 | EconLogicQA: A Question-Answering Benchmark for E… | 0.26 | 2024-05-13 |
| Mistral-7B-v0.2 | EconLogicQA: A Question-Answering Benchmark for E… | 0.26 | 2024-05-13 |
| Llama-3-8B | EconLogicQA: A Question-Answering Benchmark for E… | 0.24 | 2024-05-13 |
| Zephyr-7B-Alpha | EconLogicQA: A Question-Answering Benchmark for E… | 0.23 | 2024-05-13 |
| Yi-6B-Chat | EconLogicQA: A Question-Answering Benchmark for E… | 0.21 | 2024-05-13 |
| Zephyr-7B-Beta | EconLogicQA: A Question-Answering Benchmark for E… | 0.18 | 2024-05-13 |
| Mistral-7B-Instruct-v0.1 | EconLogicQA: A Question-Answering Benchmark for E… | 0.15 | 2024-05-13 |
| Llama-2-13B-Chat | EconLogicQA: A Question-Answering Benchmark for E… | 0.15 | 2024-05-13 |
| Llama-2-7B-Chat | EconLogicQA: A Question-Answering Benchmark for E… | 0.09 | 2024-05-13 |
| Gemma-2B-IT | EconLogicQA: A Question-Answering Benchmark for E… | 0.08 | 2024-05-13 |
| Yi-6B | EconLogicQA: A Question-Answering Benchmark for E… | 0.04 | 2024-05-13 |
| Gemma-7B-IT | EconLogicQA: A Question-Answering Benchmark for E… | 0.02 | 2024-05-13 |
| Llama-2-7B | EconLogicQA: A Question-Answering Benchmark for E… | 0.01 | 2024-05-13 |