CoLA

Corpus of Linguistic Acceptability

Dataset Information
Modalities
Texts
Languages
English
Introduced
2018
License
Homepage

Overview

The Corpus of Linguistic Acceptability (CoLA) consists of 10,657 English sentences drawn from 23 linguistics publications, expertly annotated for acceptability (grammaticality) by their original authors. The public version contains the 9,594 sentences in the training and development sets and excludes the 1,063 sentences in a held-out test set.
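As a rough illustration of how such annotated sentences might be read, the sketch below parses tab-separated rows. It assumes a four-column layout (source code, acceptability label, original author's mark, sentence), which should be checked against the actual release files. The names `ColaExample` and `parse_cola_tsv` and the sample rows are hypothetical and not taken from the corpus.

```python
import csv
import io
from dataclasses import dataclass
from typing import List


@dataclass
class ColaExample:
    source: str        # code for the publication the sentence came from
    label: int         # 1 = acceptable, 0 = unacceptable
    author_mark: str   # original author's annotation, e.g. "*" (may be empty)
    sentence: str


def parse_cola_tsv(text: str) -> List[ColaExample]:
    """Parse tab-separated rows in the assumed four-column layout."""
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    examples = []
    for row in reader:
        if len(row) != 4:
            continue  # skip blank or malformed rows
        source, label, author_mark, sentence = row
        examples.append(ColaExample(source, int(label), author_mark, sentence))
    return examples


# Illustrative rows only; these are NOT copied from the corpus.
SAMPLE = (
    "src01\t1\t\tThe cat sat on the mat.\n"
    "src01\t0\t*\tThe cat sat mat the on.\n"
)

examples = parse_cola_tsv(SAMPLE)
acceptable = sum(ex.label for ex in examples)
print(f"{len(examples)} rows, {acceptable} acceptable")

# Split sizes quoted in the overview: public + held-out test = full corpus.
PUBLIC, HELD_OUT, TOTAL = 9594, 1063, 10657
assert PUBLIC + HELD_OUT == TOTAL
```

The final assertion only checks that the split sizes quoted in the overview add up; it does not read any data.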

Source: https://nyu-mll.github.io/CoLA/
Image Source: https://arxiv.org/pdf/1805.12471.pdf

Variants: CoLA, CoLA Dev

Associated Benchmarks

This dataset is used in 3 benchmarks:

Recent Benchmark Submissions

Task | Model | Paper | Date
Linguistic Acceptability | ELC-BERT-small 24M | Not all layers are equally … | 2023-11-03
Linguistic Acceptability | LTG-BERT-base 98M | Not all layers are equally … | 2023-11-03
Linguistic Acceptability | LTG-BERT-small 24M | Not all layers are equally … | 2023-11-03
Linguistic Acceptability | ELC-BERT-base 98M | Not all layers are equally … | 2023-11-03
Linguistic Acceptability | LM-CPPF RoBERTa-base | LM-CPPF: Paraphrasing-Guided Data Augmentation for … | 2023-05-29
Linguistic Acceptability | BERT+TDA | Can BERT eat RuCoLA? Topological … | 2023-04-04
Linguistic Acceptability | RoBERTa+TDA | Can BERT eat RuCoLA? Topological … | 2023-04-04
Linguistic Acceptability | deberta-v3-base+tasksource | tasksource: A Dataset Harmonization Framework … | 2023-01-14
Linguistic Acceptability | RemBERT | RuCoLA: Russian Corpus of Linguistic … | 2022-10-23
Linguistic Acceptability | RoBERTa-large 355M (MLP quantized vector-wise, fine-tuned) | LLM.int8(): 8-bit Matrix Multiplication for … | 2022-08-15
Linguistic Acceptability | En-BERT + TDA + PCA | Acceptability Judgements via Examining the … | 2022-05-19
Linguistic Acceptability | En-BERT + TDA | Acceptability Judgements via Examining the … | 2022-05-19
Linguistic Acceptability | data2vec | data2vec: A General Framework for … | 2022-02-07
Linguistic Acceptability | Charformer-Tall | Charformer: Fast Character Transformers via … | 2021-06-23
Linguistic Acceptability | FNet-Large | FNet: Mixing Tokens with Fourier … | 2021-05-09
Linguistic Acceptability | RoBERTa-large 355M + Entailment as Few-shot Learner | Entailment as Few-Shot Learner | 2021-04-29
Linguistic Acceptability | 24hBERT | How to Train BERT with … | 2021-04-15
Linguistic Acceptability | MLM+ del-span+ reorder | CLEAR: Contrastive Learning for Sentence … | 2020-12-31
Linguistic Acceptability | RealFormer | RealFormer: Transformer Likes Residual Attention | 2020-12-21
Stochastic Optimization | Bert | Mixing ADAM and SGD: a … | 2020-11-16

Research Papers

Recent papers with results on this dataset: