ParaMAWPS

Name: ParaMAWPS
Published: 2023-06-24
License: Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International

Paraphrased Math Word Problem Solving Repository

Dataset Information

Modalities

Texts

Languages

English

Introduced

2023

License

Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International

Homepage

Official Website

Contents

Overview
Associated Benchmarks
Recent Benchmark Submissions
Research Papers

Overview

This repository contains the code, data, and models of the paper titled "Math Word Problem Solving by Generating Linguistic Variants of Problem Statements" published in the Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop).

The work is described in more detail in our Bachelor of Science (B.Sc.) thesis titled "Variational Mathematical Reasoning: Enhancing Math Word Problem Solvers with Linguistic Variants and Disentangled Attention", which can be accessed from the Islamic University of Technology (IUT) Institutional Repository.

Dataset

To download the PᴀʀᴀMAWPS dataset, navigate to the ParaMAWPS folder. We use an $80:10:10$ train-validation-test split; the three splits are available in .json format in that folder.
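
The splits can be read with Python's standard `json` module. The sketch below is only illustrative: the file names `train.json`, `valid.json`, and `test.json` are hypothetical, and it assumes each file holds a JSON array of problem records, so check the actual names and layout in the ParaMAWPS folder first.

```python
import json
from pathlib import Path

# Hypothetical file names; check the ParaMAWPS folder for the actual ones.
SPLIT_FILES = {"train": "train.json", "validation": "valid.json", "test": "test.json"}


def load_splits(folder, split_files=SPLIT_FILES):
    """Load each split as a list of problem records (assumes a JSON array per file)."""
    splits = {}
    for name, filename in split_files.items():
        with open(Path(folder) / filename, encoding="utf-8") as handle:
            splits[name] = json.load(handle)
    return splits


def split_ratios(splits):
    """Return each split's share of all samples, e.g. to check the 80:10:10 ratio."""
    total = sum(len(rows) for rows in splits.values())
    return {name: len(rows) / total for name, rows in splits.items()}
```

Calling `split_ratios(load_splits("ParaMAWPS"))` should then give shares close to 0.8, 0.1, and 0.1 if the assumptions above hold.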

Data Format

Each row represents one Math Word Problem (MWP). The table below describes each column.

Column Title	Description
`id`	The unique identification number of the sample. Seed problems have an `id` of length $\leq 4$, whereas variant problems have an `id` of length $> 4$. The last variant of a seed problem (generally with the `id` "$16000i$", where $i$ is the `id` of the seed problem) is the inverse variant of that seed problem.
`original_text`	The problem statement of the MWP. The seed problems have the same problem statement as present in the Mᴀᴡᴘs dataset.
`equation`	The equation with a variable $x$ that solves the MWP.
`quantity_tagged_text`	The problem statement of the MWP, where each quantity is replaced with a unique tag $[Q_i]$.
`quantity_tagged_equation`	The equation with a variable $x$ that solves the MWP, where each quantity is replaced with the unique tag $[Q_i]$ assigned to it in the problem statement.
`have_constant`	Whether the use of a constant value is required to solve the MWP. For an MWP sample $i$, the boolean `have_constant` label $C_i$ is $C_i =\begin{cases} \text{FALSE}, & \text{if $i$ requires $0$ constant values}\\ \text{TRUE}, & \text{if $i$ requires $\geq 1$ constant values}\end{cases}$
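
To make the column definitions concrete, here is a minimal sketch of how the tagged fields and the `have_constant` label relate to the raw text and equation. It is not the procedure used to build the dataset. It assumes that quantities are plain digit strings, that tags are numbered from 1, that the `id` "size" means its length in characters, and that a constant is any number in the equation that does not appear in the problem statement. All function names are hypothetical.

```python
import re

# Matches integers and simple decimals such as "3" or "2.5".
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def is_seed(problem_id):
    """Seed problems have an id of length <= 4; variant ids are longer."""
    return len(str(problem_id)) <= 4


def tag_quantities(text):
    """Replace each distinct number in the text with a tag [Q1], [Q2], ...

    Returns the tagged text and the mapping from number string to tag.
    Assumption: the dataset's own tagging rules (number words, repeated
    values, tag numbering) may differ from this simplification.
    """
    tags = {}

    def replace(match):
        value = match.group(0)
        if value not in tags:
            tags[value] = f"[Q{len(tags) + 1}]"
        return tags[value]

    return NUMBER.sub(replace, text), tags


def tag_equation(equation, tags):
    """Rewrite an equation using the tags assigned in the problem statement."""
    return NUMBER.sub(lambda m: tags.get(m.group(0), m.group(0)), equation)


def have_constant(equation, tags):
    """True if the equation uses at least one number absent from the text."""
    return any(value not in tags for value in NUMBER.findall(equation))


# Made-up example problem, not taken from the dataset.
text = "Tom has 3 apples and buys 5 more. How many apples does he have?"
tagged_text, tags = tag_quantities(text)
tagged_equation = tag_equation("x = 3 + 5", tags)  # "x = [Q1] + [Q2]"
needs_constant = have_constant("x = 3 + 5", tags)  # False
```

An equation such as `x = 3 * 12` for the same text would be labelled TRUE under this sketch, because 12 does not occur in the problem statement.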

Types of Variations

[Figure: types of variations]

Dataset Statistics

[Figures: dataset statistics]

Methodology

[Figure: methodology overview]
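
The paper's abstract describes solving each of the variant problems and electing the predicted expression with the majority of the votes. A minimal sketch of that voting step follows. The whitespace normalization and the tie-breaking rule (the expression predicted first wins) are assumptions made for illustration, not details taken from the paper.

```python
from collections import Counter


def majority_vote(predicted_expressions):
    """Return the expression predicted most often across a problem's variants.

    Assumption: expressions are compared as whitespace-normalized strings,
    and ties go to the expression that was predicted first.
    """
    if not predicted_expressions:
        raise ValueError("need at least one prediction")
    normalized = [" ".join(expr.split()) for expr in predicted_expressions]
    winner, _count = Counter(normalized).most_common(1)[0]
    return winner


# Hypothetical predictions for one seed problem and two of its variants.
elected = majority_vote(["x = 3 + 5", "x = 3 * 5", "x =  3 + 5"])
```

Here `elected` is `"x = 3 + 5"`, since two of the three predictions agree once whitespace is normalized.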

Results

To reproduce the results, please refer to the documentation of MWPToolkit created by Yihuai Lan et al.
[Figures: results]

Citation

If you find this work useful, please cite our paper:

@inproceedings{raiyan-etal-2023-math,
    title = "Math Word Problem Solving by Generating Linguistic Variants of Problem Statements",
    author = "Raiyan, Syed Rifat  and
      Faiyaz, Md Nafis  and
      Kabir, Shah Md. Jawad  and
      Kabir, Mohsinul  and
      Mahmud, Hasan  and
      Hasan, Md Kamrul",
    booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 4: Student Research Workshop)",
    month = jul,
    year = "2023",
    address = "Toronto, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.acl-srw.49",
    doi = "10.18653/v1/2023.acl-srw.49",
    pages = "362--378",
    abstract = "The art of mathematical reasoning stands as a fundamental pillar of intellectual progress and is a central catalyst in cultivating human ingenuity. Researchers have recently published a plethora of works centered around the task of solving Math Word Problems (MWP) {---} a crucial stride towards general AI. These existing models are susceptible to dependency on shallow heuristics and spurious correlations to derive the solution expressions. In order to ameliorate this issue, in this paper, we propose a framework for MWP solvers based on the generation of linguistic variants of the problem text. The approach involves solving each of the variant problems and electing the predicted expression with the majority of the votes. We use DeBERTa (Decoding-enhanced BERT with disentangled attention) as the encoder to leverage its rich textual representations and enhanced mask decoder to construct the solution expressions. Furthermore, we introduce a challenging dataset, ParaMAWPS, consisting of paraphrased, adversarial, and inverse variants of selectively sampled MWPs from the benchmark Mawps dataset. We extensively experiment on this dataset along with other benchmark datasets using some baseline MWP solver models. We show that training on linguistic variants of problem statements and voting on candidate predictions improve the mathematical reasoning and robustness of the model. We make our code and data publicly available.",
}

You can also cite our thesis:

@phdthesis{raiyan2023variational,
  type={Bachelor's Thesis},
  title={Variational Mathematical Reasoning: Enhancing Math Word Problem Solvers with Linguistic Variants and Disentangled Attention},
  author={Raiyan, Syed Rifat and Faiyaz, Md Nafis and Kabir, Shah Md Jawad},
  year={2023},
  school={Department of Computer Science and Engineering (CSE), Islamic University of Technology},
  address={Board Bazar, Gazipur-1704, Dhaka, Bangladesh},
  note={Available at \url{http://103.82.172.44:8080/xmlui/handle/123456789/2092}}
}

Variants: ParaMAWPS

Associated Benchmarks

This dataset is used in 1 benchmark:

Math Word Problem Solving - Metrics: Accuracy (%)

Recent Benchmark Submissions

Task	Model	Paper	Date
Math Word Problem Solving	DeBERTa (VM)	Math Word Problem Solving by …	2023-06-24
Math Word Problem Solving	GPT-3.5 Turbo (175B)	Math Word Problem Solving by …	2023-06-24
Math Word Problem Solving	GPT-J (6B)	Math Word Problem Solving by …	2023-06-24
Math Word Problem Solving	GPT-3 text-curie-001 (13B)	Math Word Problem Solving by …	2023-06-24
Math Word Problem Solving	GPT-3 text-babbage-001 (6.7B)	Math Word Problem Solving by …	2023-06-24
Math Word Problem Solving	DeBERTa	DeBERTa: Decoding-enhanced BERT with Disentangled …	2020-06-05

Research Papers

Recent papers with results on this dataset:
