TERRa

Textual Entailment Recognition for Russian

Dataset Information
Modalities: Texts
Languages: Russian
Introduced: 2020
License
Homepage

Overview

Textual Entailment Recognition has been proposed as a generic task that captures major semantic inference needs across many NLP applications, such as question answering, information retrieval, information extraction, and text summarization. Given two text fragments, the task is to recognize whether the meaning of one text is entailed by (can be inferred from) the other.

Task Type

Sentence pair classification (RTE, Recognizing Textual Entailment) with two classes: Entailment and Not Entailment
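
With only two classes, a useful floor for any model is a predictor that always outputs the most frequent training label. The following is a minimal sketch of that idea, not the implementation behind any benchmark submission. The toy labels are made up for illustration, the spelling `not_entailment` for the negative class is an assumption, and the helper names `majority_label` and `accuracy` are hypothetical.

```python
from collections import Counter


def majority_label(train_labels):
    """Return the most frequent label in the training labels."""
    if not train_labels:
        raise ValueError("train_labels must not be empty")
    return Counter(train_labels).most_common(1)[0][0]


def accuracy(gold, predicted):
    """Fraction of positions where the predicted label equals the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty list")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


# Toy labels invented for illustration; these are NOT TERRa data.
# Assumption: the negative class is spelled "not_entailment".
train_labels = ["entailment", "not_entailment", "entailment"]
test_labels = ["entailment", "not_entailment"]

# Predict the same label for every test pair, ignoring the text entirely.
predictions = [majority_label(train_labels)] * len(test_labels)
baseline_accuracy = accuracy(test_labels, predictions)
```

Because this predictor never looks at the premise or hypothesis, its score depends only on the class balance of the evaluation set.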

Example

{
  "premise": "Автор поста написал в комментарии, что прорвалась канализация.",
  "hypothesis": "Автор поста написал про канализацию.",
  "label": "entailment",
  "idx": "6062"
}
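
Records of this shape can be read with a few lines of Python. The sketch below assumes the data is stored as JSON Lines (one record per line), which is an assumption about the file layout rather than something stated here. The class name `EntailmentPair` and the function `parse_records` are hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class EntailmentPair:
    """One premise/hypothesis pair with its label (hypothetical container)."""
    premise: str
    hypothesis: str
    label: str
    idx: str


def parse_records(lines):
    """Parse an iterable of JSON strings into EntailmentPair objects.

    Blank lines are skipped. A missing field raises KeyError.
    """
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        raw = json.loads(line)
        pairs.append(
            EntailmentPair(
                premise=raw["premise"],
                hypothesis=raw["hypothesis"],
                label=raw["label"],
                idx=str(raw["idx"]),  # normalize idx to a string
            )
        )
    return pairs


# The example record from this page, written as a single JSON line.
sample = (
    '{"premise": "Автор поста написал в комментарии, что прорвалась канализация.", '
    '"hypothesis": "Автор поста написал про канализацию.", '
    '"label": "entailment", "idx": "6062"}'
)

pairs = parse_records([sample])
```

Here `pairs[0].label` holds the class of the pair, and `pairs[0].premise` and `pairs[0].hypothesis` hold the two text fragments to compare.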

How did we collect data?

All text examples were collected from open news sources and literary magazines, then manually reviewed and supplemented with human assessment on Yandex.Toloka.

Variants: TERRa

Associated Benchmarks

This dataset is used in 1 benchmark:

Recent Benchmark Submissions

Task                        Model               Paper                                                  Date
Natural Language Inference  heuristic majority  Unreasonable Effectiveness of Rule-Based Heuristics …  2021-05-03
Natural Language Inference  majority_class      Unreasonable Effectiveness of Rule-Based Heuristics …  2021-05-03
Natural Language Inference  Random weighted     Unreasonable Effectiveness of Rule-Based Heuristics …  2021-05-03
Natural Language Inference  Human Benchmark     RussianSuperGLUE: A Russian Language Understanding …   2020-10-29
Natural Language Inference  Baseline TF-IDF1.1  RussianSuperGLUE: A Russian Language Understanding …   2020-10-29
Natural Language Inference  MT5 Large           mT5: A massively multilingual pre-trained …            2020-10-22

Research Papers

Recent papers with results on this dataset: