| GPT-4 DUP | Achieving >97% on GSM8K: Deeply Understanding the… | 94.20 | 2024-04-23 |
| GPT-4 (Teaching-Inspired) | Teaching-Inspired Integrated Prompting Framework:… | 93.90 | 2024-10-10 |
| GPT-4 (Model Selection) | Automatic Model Selection with Large Language Mod… | 93.70 | 2023-05-23 |
| GPT-4 (PHP) | Progressive-Hint Prompting Improves Reasoning in … | 91.90 | 2023-04-19 |
| OpenMath-CodeLlama-70B (w/ code) | OpenMathInstruct-1: A 1.8 Million Math Instructio… | 87.80 | 2024-02-15 |
| MathCoder-L-70B | MathCoder: Seamless Code Integration in LLMs for … | 84.90 | 2023-10-05 |
| MMOS-CODE-34B(0-shot) | An Empirical Study of Data Ability Boundary in LL… | 80.60 | 2024-02-23 |
| MMOS-DeepSeekMath-7B(0-shot) | An Empirical Study of Data Ability Boundary in LL… | 79.30 | 2024-02-23 |
| MMOS-CODE-7B(0-shot) | An Empirical Study of Data Ability Boundary in LL… | 76.40 | 2024-02-23 |
| LLaMA 2-Chat | Llama 2: Open Foundation and Fine-Tuned Chat Mode… | 69.20 | 2023-07-18 |
| DeBERTa | Math Word Problem Solving by Generating Linguisti… | 63.50 | 2023-06-24 |
| PaLM (zero-shot, CoT) | Large Language Models are Zero-Shot Reasoners | 62.10 | 2022-05-24 |
| PaLM (zero-shot) | Large Language Models are Zero-Shot Reasoners | 58.80 | 2022-05-24 |
| SYRELM (Vicuna 13B) | Frugal LMs Trained to Invoke Symbolic Solvers Ach… | 56.65 | 2023-12-09 |
| ATHENA (roberta-large) | ATHENA: Mathematical Reasoning with Thought Expan… | 54.80 | 2023-11-02 |
| MsAT-DeductReasoner | Learning Multi-Step Reasoning by Solving Arithmet… | 48.90 | 2023-06-02 |
| Roberta-DeductReasoner | Learning to Reason Deductively: Math Word Problem… | 47.30 | 2022-03-19 |
| ATHENA (roberta-base) | ATHENA: Mathematical Reasoning with Thought Expan… | 45.60 | 2023-11-02 |
| Graph2Tree with RoBERTa | Are NLP Models really able to Solve Simple Math W… | 43.80 | 2021-03-12 |
| GTS with RoBERTa | Are NLP Models really able to Solve Simple Math W… | 41.00 | 2021-03-12 |
| LSTM Seq2Seq with RoBERTa | Are NLP Models really able to Solve Simple Math W… | 40.30 | 2021-03-12 |
| SYRELM (GPT-J) | Frugal LMs Trained to Invoke Symbolic Solvers Ach… | 40.10 | 2023-12-09 |
| Transformer with RoBERTa | Are NLP Models really able to Solve Simple Math W… | 38.90 | 2021-03-12 |