FarsTail

Dataset Information
Modalities: Texts
Languages: Persian
Introduced: 2020

Overview

Natural Language Inference (NLI), also called Textual Entailment, is an important NLP task whose goal is to determine the inference relationship between a premise p and a hypothesis h. It is a three-class problem in which each pair (p, h) is assigned one of the following labels: "ENTAILMENT" if the hypothesis can be inferred from the premise, "CONTRADICTION" if the hypothesis contradicts the premise, and "NEUTRAL" if neither holds. Large NLI datasets such as SNLI, MNLI, and SciTail exist for English, but few are available for low-resource languages like Persian. Persian (Farsi) is a pluricentric language spoken by around 110 million people in countries such as Iran, Afghanistan, and Tajikistan.

FarsTail is the first relatively large-scale Persian dataset for the NLI task. It contains a total of 10,367 samples generated from a collection of 3,539 multiple-choice questions. The train, validation, and test portions include 7,266, 1,537, and 1,564 instances, respectively.
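As a minimal sketch of the three-class setup and the split sizes described above, the following Python snippet models an NLI pair and checks that the reported splits add up to the reported total. The names `NLILabel`, `NLIExample`, `SPLIT_SIZES`, and `split_fractions` are hypothetical and chosen for illustration only; they are not taken from the FarsTail repository, and the sample pair is an invented English illustration rather than a dataset entry.

```python
from dataclasses import dataclass
from enum import Enum


class NLILabel(Enum):
    """The three classes named in the overview."""
    ENTAILMENT = "ENTAILMENT"        # hypothesis can be inferred from the premise
    CONTRADICTION = "CONTRADICTION"  # hypothesis contradicts the premise
    NEUTRAL = "NEUTRAL"              # neither of the above holds


@dataclass(frozen=True)
class NLIExample:
    """One (premise, hypothesis) pair with its label (hypothetical container)."""
    premise: str
    hypothesis: str
    label: NLILabel


# Split sizes and total as reported in the overview.
SPLIT_SIZES = {"train": 7266, "validation": 1537, "test": 1564}
TOTAL_SAMPLES = 10367


def split_fractions(sizes):
    """Return each split's share of the total number of samples."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


# Invented English pair for illustration only; not drawn from FarsTail.
example = NLIExample(
    premise="A cat is sleeping on the sofa.",
    hypothesis="An animal is resting.",
    label=NLILabel.ENTAILMENT,
)

if __name__ == "__main__":
    # The three splits should account for every sample.
    assert sum(SPLIT_SIZES.values()) == TOTAL_SAMPLES
    for name, share in split_fractions(SPLIT_SIZES).items():
        print(f"{name}: {share:.1%}")
```

Under these numbers the split is roughly 70% train, 15% validation, and 15% test.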

Source: https://github.com/dml-qom/FarsTail

Variants: FarsTail

Associated Benchmarks

This dataset is used in 1 benchmark:

Recent Benchmark Submissions

Task | Model | Paper | Date
Natural Language Inference | mBERT | FarsTail: A Persian Natural Language … | 2020-09-18
Natural Language Inference | ParsBERT | FarsTail: A Persian Natural Language … | 2020-09-18
Natural Language Inference | Translate-Source + fastText | FarsTail: A Persian Natural Language … | 2020-09-18
Natural Language Inference | LSTM + BERT (concat) | FarsTail: A Persian Natural Language … | 2020-09-18
Natural Language Inference | ESIM + BERT (FarsTail, MultiNLI) | FarsTail: A Persian Natural Language … | 2020-09-18
Natural Language Inference | ULMFiT | FarsTail: A Persian Natural Language … | 2020-09-18
Natural Language Inference | ESIM + fastText | FarsTail: A Persian Natural Language … | 2020-09-18
Natural Language Inference | Translate-Target + fastText | FarsTail: A Persian Natural Language … | 2020-09-18
Natural Language Inference | Decomposable Attention Model + word2vec | FarsTail: A Persian Natural Language … | 2020-09-18
Natural Language Inference | HBMP + word2vec | FarsTail: A Persian Natural Language … | 2020-09-18

Research Papers

Recent papers with results on this dataset: