GSM-Plus

Dataset Information
Modalities
Texts
Languages
English
Introduced
2024
License
Unknown

Overview

GSM-Plus is an adversarial dataset for grade-school math, created by perturbing the widely used GSM8K dataset. Motivated by the capability taxonomy for solving math problems described in Polya's principles, the authors identify 5 perspectives to guide the development of GSM-Plus:

  • Numerical Variation refers to altering the numerical data or its types, including 3 subcategories: Numerical Substitution, Digit Expansion, and Integer-decimal-fraction Conversion.
  • Arithmetic Variation refers to reversing or introducing additional operations (e.g., addition, subtraction, multiplication, and division) to math problems, including 2 subcategories: Adding Operation and Reversing Operation.
  • Problem Understanding refers to rephrasing the text description of the math problems.
  • Distractor Insertion refers to inserting topic-related but irrelevant sentences into the problems.
  • Critical Thinking tests whether a model can question or doubt a problem when the problem lacks necessary statements.
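
The taxonomy above can be sketched in code. The following is a minimal illustration, not the authors' construction pipeline: the seed question, the distractor sentence, and the helper names `substitute_number` and `insert_distractor` are all hypothetical, and the two helpers only mimic the simplest forms of Numerical Substitution and Distractor Insertion.

```python
import re

# Perspectives and their named subcategories, as listed above.
# Perspectives without named subcategories map to an empty tuple.
PERSPECTIVES = {
    "Numerical Variation": (
        "Numerical Substitution",
        "Digit Expansion",
        "Integer-decimal-fraction Conversion",
    ),
    "Arithmetic Variation": ("Adding Operation", "Reversing Operation"),
    "Problem Understanding": (),
    "Distractor Insertion": (),
    "Critical Thinking": (),
}


def substitute_number(question: str, old: int, new: int) -> str:
    """Toy Numerical Substitution: replace the first whole-number token `old`."""
    return re.sub(rf"\b{old}\b", str(new), question, count=1)


def insert_distractor(question: str, distractor: str) -> str:
    """Toy Distractor Insertion: place a sentence before the final sentence."""
    head, sep, tail = question.rpartition(". ")
    if not sep:
        # Single-sentence question: prepend the distractor instead.
        return f"{distractor} {question}"
    return f"{head}. {distractor} {tail}"


# Hypothetical seed question (not taken from GSM8K or GSM-Plus).
seed = "Tom has 5 apples. He buys 3 more. How many apples does Tom have?"

numeric_variant = substitute_number(seed, 5, 12)
distractor_variant = insert_distractor(seed, "His sister likes oranges.")
```

Note that a numerical perturbation changes the gold answer, whereas a distractor leaves it unchanged, so each variant in a real pipeline would need its answer re-derived or re-checked.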

GSM-Plus can be used to evaluate the robustness of current LLMs in mathematical reasoning.
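
One simple way to quantify robustness is to compare a model's accuracy on the original questions with its accuracy on the perturbed ones. The sketch below assumes exact-match scoring on final answers and reports a relative accuracy drop; the function names and the toy predictions are hypothetical, and this is not necessarily the metric used by the GSM-Plus authors.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], answers: Sequence[str]) -> float:
    """Exact-match accuracy over paired predictions and gold answers."""
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)


def relative_drop(original_acc: float, perturbed_acc: float) -> float:
    """Fraction of the original accuracy lost on the perturbed questions."""
    if original_acc == 0:
        raise ValueError("original accuracy is zero; relative drop is undefined")
    return (original_acc - perturbed_acc) / original_acc


# Hypothetical toy data: four original questions and their perturbed variants.
original_gold = ["8", "5", "7", "3"]
original_pred = ["8", "5", "7", "3"]   # model solves all originals
perturbed_gold = ["15", "5", "9", "3"]
perturbed_pred = ["15", "6", "9", "4"]  # model fails two variants

acc_original = accuracy(original_pred, original_gold)
acc_perturbed = accuracy(perturbed_pred, perturbed_gold)
drop = relative_drop(acc_original, acc_perturbed)
```

A drop near zero suggests the model's reasoning transfers to the perturbed variants; a large drop suggests sensitivity to surface changes in the problem.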

Variants: GSM-Plus

Associated Benchmarks

This dataset is used in 1 benchmark:

Recent Benchmark Submissions

Task | Model | Paper | Date
Math Word Problem Solving | GPT-4 | GSM-Plus: A Comprehensive Benchmark for … | 2024-02-29

Research Papers

Recent papers with results on this dataset: