RCB

Russian Commitment Bank

Dataset Information
Modalities: Texts
Languages: Russian
Introduced: 2020

Overview

The Russian Commitment Bank (RCB) is a corpus of naturally occurring discourses whose final sentence contains a clause-embedding predicate under an entailment-cancelling operator (a question, a modal, negation, or the antecedent of a conditional).

Task Type

Recognizing Textual Entailment (RTE), framed as sentence-pair classification with three labels: entailment, contradiction, neutral.

Example

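{
  "premise": "Сумма ущерба составила одну тысячу рублей. Уточняется, что на место происшествия выехала следственная группа, которая установила личность злоумышленника. Им оказался местный житель, ранее судимый за подобное правонарушение.",
  "hypothesis": "Ранее местный житель совершал подобное правонарушение.",
  "verb": "судить",
  "negation": "no_negation",
  "label": "entailment",
  "idx": 269
}

English gloss of the example. Premise: "The damage amounted to one thousand rubles. It is specified that an investigative team went to the scene and established the identity of the perpetrator. He turned out to be a local resident previously convicted of a similar offense." Hypothesis: "The local resident had previously committed a similar offense." Verb: "судить" (to judge, to convict).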
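The record layout shown in the example can be read with a few lines of Python. The following is a minimal sketch, not an official loader: the class name `RCBExample`, the helper `parse_example`, the integer encoding in `LABEL_TO_ID`, and the idea that unlabeled records may omit `label` are all assumptions made for illustration. Only the field names and the three label strings come from this page.

```python
import json
from dataclasses import dataclass
from typing import Optional

# The three classes named in the task description.
LABELS = ("entailment", "contradiction", "neutral")
# Hypothetical integer encoding; the ordering is an arbitrary choice.
LABEL_TO_ID = {name: i for i, name in enumerate(LABELS)}


@dataclass
class RCBExample:
    premise: str
    hypothesis: str
    verb: str
    negation: str
    idx: int
    label: Optional[str] = None  # assumption: unlabeled records may omit it


def parse_example(line: str) -> RCBExample:
    """Parse one JSON object and check its label against LABELS."""
    record = json.loads(line)
    label = record.get("label")
    if label is not None and label not in LABELS:
        raise ValueError(f"unexpected label: {label!r}")
    return RCBExample(
        premise=record["premise"],
        hypothesis=record["hypothesis"],
        verb=record["verb"],
        negation=record["negation"],
        idx=record["idx"],
        label=label,
    )


# The example record from this page, serialized to a single JSON line.
SAMPLE_LINE = json.dumps(
    {
        "premise": "Сумма ущерба составила одну тысячу рублей. Уточняется, "
        "что на место происшествия выехала следственная группа, которая "
        "установила личность злоумышленника. Им оказался местный житель, "
        "ранее судимый за подобное правонарушение.",
        "hypothesis": "Ранее местный житель совершал подобное правонарушение.",
        "verb": "судить",
        "negation": "no_negation",
        "label": "entailment",
        "idx": 269,
    },
    ensure_ascii=False,
)

example = parse_example(SAMPLE_LINE)
print(example.idx, example.label, LABEL_TO_ID[example.label])
```

Validating the label at parse time is a design choice: a misspelled or unexpected class name fails loudly instead of silently becoming a fourth class during training.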

How did we collect data?

All text examples were collected from open news sources and literary magazines, then manually reviewed and supplemented with human assessments on Yandex.Toloka.

Variants: RCB

Associated Benchmarks

This dataset is used in 1 benchmark.

Recent Benchmark Submissions

Task | Model | Paper | Date
Natural Language Inference | heuristic majority | Unreasonable Effectiveness of Rule-Based Heuristics … | 2021-05-03
Natural Language Inference | Random weighted | Unreasonable Effectiveness of Rule-Based Heuristics … | 2021-05-03
Natural Language Inference | majority_class | Unreasonable Effectiveness of Rule-Based Heuristics … | 2021-05-03
Natural Language Inference | Human Benchmark | RussianSuperGLUE: A Russian Language Understanding … | 2020-10-29
Natural Language Inference | Baseline TF-IDF1.1 | RussianSuperGLUE: A Russian Language Understanding … | 2020-10-29
Natural Language Inference | MT5 Large | mT5: A massively multilingual pre-trained … | 2020-10-22
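The submissions include an entry named majority_class, which this page does not describe further. The sketch below assumes the usual meaning of that name: always predict the label that is most frequent in the training split. The function names, the choice of accuracy as the metric, and the toy label lists are illustrative assumptions, not data or code from the benchmark.

```python
from collections import Counter
from typing import Sequence


def fit_majority_class(train_labels: Sequence[str]) -> str:
    """Return the most frequent label in the training labels."""
    if not train_labels:
        raise ValueError("train_labels must not be empty")
    return Counter(train_labels).most_common(1)[0][0]


def accuracy(gold: Sequence[str], predicted: Sequence[str]) -> float:
    """Fraction of positions where the prediction equals the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("gold must not be empty")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


# Toy labels invented for illustration; these are NOT RCB statistics.
toy_train = ["neutral", "entailment", "neutral", "contradiction"]
toy_dev = ["neutral", "entailment"]

majority = fit_majority_class(toy_train)
predictions = [majority] * len(toy_dev)
print(majority, accuracy(toy_dev, predictions))
```

A baseline of this kind ignores the premise and hypothesis entirely, so it gives a floor that any model reading the text should clear.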

Research Papers

Recent papers with results on this dataset: