RuCoLA

Dataset Information
Modalities: Texts
Languages: Russian
Introduced: 2022
License: Unknown

Overview

The Russian Corpus of Linguistic Acceptability (RuCoLA) is built from the ground up under the well-established binary linguistic acceptability (LA) approach, in which each sentence is labeled as either acceptable or unacceptable. RuCoLA consists of 9.8k in-domain sentences from linguistic publications and 3.6k out-of-domain sentences produced by generative models.

Source: RuCoLA: Russian Corpus of Linguistic Acceptability

Image Source: https://arxiv.org/pdf/2210.12814v1.pdf

Variants: RuCoLA
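
To make the composition described in the overview concrete, the following is a minimal sketch of how a single example and the in-domain/out-of-domain proportions could be represented. The record layout and field names (`sentence`, `acceptable`, `in_domain`) are assumptions chosen for illustration and are not the dataset's actual schema; the sizes are the rounded figures quoted above.

```python
from dataclasses import dataclass

# Approximate sizes quoted in the overview (rounded to the nearest 0.1k).
IN_DOMAIN_SENTENCES = 9_800
OUT_OF_DOMAIN_SENTENCES = 3_600


@dataclass(frozen=True)
class AcceptabilityExample:
    """Hypothetical record layout; field names are illustrative only."""

    sentence: str
    acceptable: bool  # binary LA label: acceptable vs. unacceptable
    in_domain: bool   # True: linguistic publications; False: generative models


def domain_shares(in_domain: int, out_of_domain: int) -> tuple[int, float, float]:
    """Return the total size and the fraction contributed by each domain."""
    total = in_domain + out_of_domain
    return total, in_domain / total, out_of_domain / total


total, in_share, out_share = domain_shares(
    IN_DOMAIN_SENTENCES, OUT_OF_DOMAIN_SENTENCES
)
print(f"total ~ {total}, in-domain ~ {in_share:.1%}, out-of-domain ~ {out_share:.1%}")
# prints: total ~ 13400, in-domain ~ 73.1%, out-of-domain ~ 26.9%
```

Under these rounded figures, roughly three quarters of the corpus is in-domain, so results on the out-of-domain portion are worth reporting separately rather than folding them into a single aggregate.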

Associated Benchmarks

This dataset is used in 1 benchmark: Linguistic Acceptability.

Recent Benchmark Submissions

Task                     | Model          | Paper                                   | Date
-------------------------|----------------|-----------------------------------------|-----------
Linguistic Acceptability | Ru-BERT+TDA    | Can BERT eat RuCoLA? Topological …      | 2023-04-04
Linguistic Acceptability | Ru-RoBERTa+TDA | Can BERT eat RuCoLA? Topological …      | 2023-04-04
Linguistic Acceptability | RemBERT        | RuCoLA: Russian Corpus of Linguistic …  | 2022-10-23
Linguistic Acceptability | ruBERT         | RuCoLA: Russian Corpus of Linguistic …  | 2022-10-23
Linguistic Acceptability | ruGPT-3        | RuCoLA: Russian Corpus of Linguistic …  | 2022-10-23
Linguistic Acceptability | ruT5           | RuCoLA: Russian Corpus of Linguistic …  | 2022-10-23
Linguistic Acceptability | mBERT          | RuCoLA: Russian Corpus of Linguistic …  | 2022-10-23
Linguistic Acceptability | XLM-R          | RuCoLA: Russian Corpus of Linguistic …  | 2022-10-23
Linguistic Acceptability | ruRoBERTa      | RuCoLA: Russian Corpus of Linguistic …  | 2022-10-23

Research Papers

Recent papers with results on this dataset: