MATH

Dataset Information
Modalities: Texts
Languages: English
Introduced: 2021
License: MIT

Overview

MATH is a new dataset of 12,500 challenging competition mathematics problems. Each problem in MATH has a full step-by-step solution which can be used to teach models to generate answer derivations and explanations.

Source: Hendrycks et al.

Image source: Hendrycks et al.

Variants: MATH, MATH minival
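
Because each problem ships with a worked solution, evaluation usually compares only the final answer rather than the whole derivation. The sketch below shows one way to pull that answer out of a solution string. It assumes, which this page does not state, that a record exposes `problem` and `solution` fields and that the final answer is wrapped in a LaTeX `\boxed{...}` command. The record shown is hypothetical and written for illustration, and `extract_boxed_answer` is an illustrative helper, not part of any official tooling.

```python
from typing import Optional


def extract_boxed_answer(solution: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in a solution, or None."""
    marker = r"\boxed{"
    start = solution.rfind(marker)
    if start == -1:
        return None  # no boxed answer present

    # Walk forward from the opening brace, tracking nesting depth so that
    # answers containing their own braces (e.g. \frac{5}{6}) stay intact.
    i = start + len(marker)
    depth = 1
    chars = []
    while i < len(solution):
        ch = solution[i]
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars)
        chars.append(ch)
        i += 1
    return None  # braces never closed


# Hypothetical record (assumed field names, invented content).
record = {
    "problem": "What is $\\frac{1}{2} + \\frac{1}{3}$?",
    "solution": (
        "A common denominator is 6, so the sum is "
        "$\\frac{3}{6} + \\frac{2}{6} = \\boxed{\\frac{5}{6}}$."
    ),
}

answer = extract_boxed_answer(record["solution"])
print(answer)
```

Brace counting is used instead of a regular expression because answers frequently contain nested braces, which a simple non-greedy pattern would truncate.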

Associated Benchmarks

This dataset is used in 1 benchmark:

Recent Benchmark Submissions

Task | Model | Paper | Date
Math Word Problem Solving | OpenMath2-Llama3.1-70B (majority@256) | OpenMathInstruct-2: Accelerating AI for Math … | 2024-10-02
Math Word Problem Solving | OpenMath2-Llama3.1-70B | OpenMathInstruct-2: Accelerating AI for Math … | 2024-10-02
Math Word Problem Solving | OpenMath2-Llama3.1-8B | OpenMathInstruct-2: Accelerating AI for Math … | 2024-10-02
Math Word Problem Solving | OpenMath2-Llama3.1-8B (majority@256) | OpenMathInstruct-2: Accelerating AI for Math … | 2024-10-02
Math Word Problem Solving | Qwen2.5-Math-72B-Instruct(COT,Greedy) | Qwen2.5-Math Technical Report: Toward Mathematical … | 2024-09-18
Math Word Problem Solving | Qwen2.5-Math-7B-Instruct(COT,Greedy) | Qwen2.5-Math Technical Report: Toward Mathematical … | 2024-09-18
Math Word Problem Solving | Qwen2.5-Math-72B-Instruct(TIR,Greedy) | Qwen2.5-Math Technical Report: Toward Mathematical … | 2024-09-18
Math Word Problem Solving | Qwen2.5-Math-1.5B-Instruct(TIR,Greedy) | Qwen2.5-Math Technical Report: Toward Mathematical … | 2024-09-18
Math Word Problem Solving | Qwen2.5-Math-7B-Instruct(TIR,Greedy) | Qwen2.5-Math Technical Report: Toward Mathematical … | 2024-09-18
Math Word Problem Solving | Qwen2.5-Math-1.5B-Instruct(COT,Greedy) | Qwen2.5-Math Technical Report: Toward Mathematical … | 2024-09-18
Math Word Problem Solving | Qwen2-Math-72B-Instruct(greedy) | Qwen2 Technical Report | 2024-07-15
Math Word Problem Solving | Qwen2-72B-Instruct-Step-DPO (0-shot CoT, w/o code) | Step-DPO: Step-wise Preference Optimization for … | 2024-06-26
Math Word Problem Solving | DART-Math-Llama3-70B-Uniform (0-shot CoT, w/o code) | DART-Math: Difficulty-Aware Rejection Tuning for … | 2024-06-18
Math Word Problem Solving | DART-Math-DSMath-7B-Prop2Diff (0-shot CoT, w/o code) | DART-Math: Difficulty-Aware Rejection Tuning for … | 2024-06-18
Math Word Problem Solving | DART-Math-Mistral-7B-Prop2Diff (0-shot CoT, w/o code) | DART-Math: Difficulty-Aware Rejection Tuning for … | 2024-06-18
Math Word Problem Solving | DART-Math-Llama3-70B-Prop2Diff (0-shot CoT, w/o code) | DART-Math: Difficulty-Aware Rejection Tuning for … | 2024-06-18
Math Word Problem Solving | DART-Math-DSMath-7B-Uniform (0-shot CoT, w/o code) | DART-Math: Difficulty-Aware Rejection Tuning for … | 2024-06-18
Math Word Problem Solving | DART-Math-Llama3-8B-Prop2Diff (0-shot CoT, w/o code) | DART-Math: Difficulty-Aware Rejection Tuning for … | 2024-06-18
Math Word Problem Solving | DART-Math-Llama3-8B-Uniform (0-shot CoT, w/o code) | DART-Math: Difficulty-Aware Rejection Tuning for … | 2024-06-18
Math Word Problem Solving | DART-Math-Mistral-7B-Uniform (0-shot CoT, w/o code) | DART-Math: Difficulty-Aware Rejection Tuning for … | 2024-06-18

Research Papers

Recent papers with results on this dataset: