The MATH-Vision (MATH-V) dataset is a meticulously curated collection of 3,040 high-quality mathematical problems with visual contexts, sourced from real math competitions. Spanning 16 distinct mathematical disciplines and graded across 5 levels of difficulty, the dataset provides a comprehensive and diverse set of challenges for evaluating the mathematical reasoning abilities of large multimodal models (LMMs).
Extensive experiments reveal a notable gap between current LMMs and human performance on MATH-V, underscoring the need for further advances in LMMs. Moreover, the detailed categorization by subject and difficulty supports a thorough error analysis of LMMs, offering valuable insights to guide future research and development.
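Because every problem carries a subject and a difficulty label, model predictions can be sliced along both axes for the kind of error analysis described above. The sketch below is a minimal illustration, not the paper's evaluation code: it assumes hypothetical prediction records with `subject`, `level`, and `is_correct` fields and aggregates accuracy per subject and per difficulty level.

```python
from collections import defaultdict

def accuracy_breakdown(records):
    """Aggregate accuracy by subject and by difficulty level.

    `records` is a list of dicts with hypothetical fields:
      subject (str), level (int, 1-5), is_correct (bool).
    """
    by_subject = defaultdict(lambda: [0, 0])  # subject -> [correct, total]
    by_level = defaultdict(lambda: [0, 0])    # level   -> [correct, total]
    for r in records:
        for bucket in (by_subject[r["subject"]], by_level[r["level"]]):
            bucket[0] += int(r["is_correct"])
            bucket[1] += 1
    ratios = lambda d: {k: c / t for k, (c, t) in sorted(d.items())}
    return ratios(by_subject), ratios(by_level)

# Toy usage with made-up records; a real run would score all 3,040 answers.
records = [
    {"subject": "analytic geometry", "level": 2, "is_correct": True},
    {"subject": "analytic geometry", "level": 4, "is_correct": False},
    {"subject": "topology", "level": 3, "is_correct": False},
]
per_subject, per_level = accuracy_breakdown(records)
print(per_subject)  # {'analytic geometry': 0.5, 'topology': 0.0}
print(per_level)    # {2: 1.0, 3: 0.0, 4: 0.0}
```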
Variants: MATH-V
This dataset is used in 1 benchmark:
| Task | Model | Paper | Date |
|---|---|---|---|
| Multimodal Reasoning | GPT4V | Measuring Multimodal Mathematical Reasoning with … | 2024-02-22 |
| Multimodal Reasoning | Gemini Pro | Measuring Multimodal Mathematical Reasoning with … | 2024-02-22 |
| Multimodal Reasoning | Qwen-VL-Max | Measuring Multimodal Mathematical Reasoning with … | 2024-02-22 |
| Multimodal Reasoning | InternLM-XComposer2-VL | Measuring Multimodal Mathematical Reasoning with … | 2024-02-22 |