MATH-V

Dataset Information
Modalities
Images, Texts
Languages
English
Introduced
2024
License
MIT

Overview

The Math-Vision (Math-V) dataset is a curated collection of 3,040 mathematical problems with visual contexts, sourced from real math competitions. Spanning 16 mathematical disciplines and graded across 5 levels of difficulty, it provides a diverse set of challenges for evaluating the mathematical reasoning abilities of large multimodal models (LMMs).

Experiments reveal a notable gap between current LMMs and human performance on Math-Vision, which indicates that LMMs need further advances in mathematical reasoning. The detailed categorization by discipline and difficulty level also supports fine-grained error analysis of LMMs, offering insights to guide future research and development.
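As a rough illustration of the kind of error analysis the categorization makes possible, the sketch below groups model predictions by discipline and by difficulty level and reports accuracy for each group. The record fields (`subject`, `level`, `answer`, `prediction`), the helper `accuracy_by`, the exact-match scoring, and the toy records are all assumptions made for illustration; they are not taken from the dataset's documented schema or from the authors' evaluation code.

```python
from collections import defaultdict


def accuracy_by(records, key):
    """Group records by `key` and return {group: fraction answered correctly}.

    Assumes each record is a dict with hypothetical fields
    "answer" and "prediction", scored by exact string match.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for record in records:
        group = record[key]
        totals[group] += 1
        correct[group] += int(record["prediction"] == record["answer"])
    return {group: correct[group] / totals[group] for group in totals}


# Hypothetical toy records; real evaluation data would replace these.
records = [
    {"subject": "algebra", "level": 1, "answer": "A", "prediction": "A"},
    {"subject": "algebra", "level": 3, "answer": "12", "prediction": "10"},
    {"subject": "solid geometry", "level": 3, "answer": "C", "prediction": "C"},
    {"subject": "solid geometry", "level": 5, "answer": "7", "prediction": "8"},
]

by_subject = accuracy_by(records, "subject")
by_level = accuracy_by(records, "level")

for name, table in (("subject", by_subject), ("level", by_level)):
    for group, acc in sorted(table.items(), key=lambda item: str(item[0])):
        print(f"{name}={group}: {acc:.2f}")
```

Exact string match is the simplest scoring choice; free-form answers would likely need normalization before comparison.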

Variants: MATH-V

Associated Benchmarks

This dataset is used in 1 benchmark:

Recent Benchmark Submissions

Task                  Model                   Paper                                               Date
Multimodal Reasoning  GPT4V                   Measuring Multimodal Mathematical Reasoning with …  2024-02-22
Multimodal Reasoning  Gemini Pro              Measuring Multimodal Mathematical Reasoning with …  2024-02-22
Multimodal Reasoning  Qwen-VL-Max             Measuring Multimodal Mathematical Reasoning with …  2024-02-22
Multimodal Reasoning  InternLM-XComposer2-VL  Measuring Multimodal Mathematical Reasoning with …  2024-02-22

Research Papers

Recent papers with results on this dataset: