GSM8K | Xolver | Xolver: Multi-Agent Reasoning with Holistic … | 2025-06-17 |
Arithmetic Reasoning | Qwen2.5-32B + CAPO | CAPO: Cost-Aware Prompt Optimization | 2025-04-22 |
Arithmetic Reasoning | Mistral-Small-24B + CAPO | CAPO: Cost-Aware Prompt Optimization | 2025-04-22 |
Arithmetic Reasoning | Llama-3.3-70B + CAPO | CAPO: Cost-Aware Prompt Optimization | 2025-04-22 |
GSM8K | Orange-mini | MyGO Multiplex CoT: A Method … | 2025-01-20 |
Arithmetic Reasoning | GPT-4 (Teaching-Inspired) | Teaching-Inspired Integrated Prompting Framework: A … | 2024-10-10 |
Arithmetic Reasoning | OpenMath2-Llama3.1-8B (majority@256) | OpenMathInstruct-2: Accelerating AI for Math … | 2024-10-02 |
Arithmetic Reasoning | OpenMath2-Llama3.1-8B | OpenMathInstruct-2: Accelerating AI for Math … | 2024-10-02 |
Arithmetic Reasoning | OpenMath2-Llama3.1-70B | OpenMathInstruct-2: Accelerating AI for Math … | 2024-10-02 |
Arithmetic Reasoning | OpenMath2-Llama3.1-70B (majority@256) | OpenMathInstruct-2: Accelerating AI for Math … | 2024-10-02 |
Arithmetic Reasoning | Qwen2-Math-72B-Instruct (greedy) | Qwen2 Technical Report | 2024-07-15 |
Arithmetic Reasoning | Qwen2-72B-Instruct-Step-DPO (0-shot CoT) | Step-DPO: Step-wise Preference Optimization for … | 2024-06-26 |
Arithmetic Reasoning | DART-Math-Llama3-70B-Uniform (0-shot CoT, w/o code) | DART-Math: Difficulty-Aware Rejection Tuning for … | 2024-06-18 |
Arithmetic Reasoning | DART-Math-Mistral-7B-Uniform (0-shot CoT, w/o code) | DART-Math: Difficulty-Aware Rejection Tuning for … | 2024-06-18 |
Arithmetic Reasoning | GaC(Qwen2-72B-Instruct + Llama-3-70B-Instruct) | Breaking the Ceiling of the … | 2024-06-18 |
Arithmetic Reasoning | Claude 3.5 Sonnet (HPT) | Hierarchical Prompting Taxonomy: A Universal … | 2024-06-18 |
Arithmetic Reasoning | DART-Math-Llama3-8B-Uniform (0-shot CoT, w/o code) | DART-Math: Difficulty-Aware Rejection Tuning for … | 2024-06-18 |
Arithmetic Reasoning | DART-Math-Mistral-7B-Prop2Diff (0-shot CoT, w/o code) | DART-Math: Difficulty-Aware Rejection Tuning for … | 2024-06-18 |
Arithmetic Reasoning | DART-Math-DSMath-7B-Uniform (0-shot CoT, w/o code) | DART-Math: Difficulty-Aware Rejection Tuning for … | 2024-06-18 |
Arithmetic Reasoning | DART-Math-DSMath-7B-Prop2Diff (0-shot CoT, w/o code) | DART-Math: Difficulty-Aware Rejection Tuning for … | 2024-06-18 |