JamPatoisNLI

Dataset Information
Languages: Jamaican Creole English
License: Unknown

Overview

JamPatoisNLI provides the first dataset for natural language inference in a creole language, Jamaican Patois. Many of the most-spoken low-resource languages are creoles. These languages commonly have a lexicon derived from a major world language and a distinctive grammar reflecting the languages of the original speakers and the process of language birth by creolization. This gives them a distinctive place in exploring the effectiveness of transfer from large monolingual or multilingual pretrained models. While our work, along with previous work, shows that transfer from these models to low-resource languages that are unrelated to languages in their training set is not very effective, we would expect stronger results from transfer to creoles. Indeed, our experiments show considerably better results from few-shot learning of JamPatoisNLI than for such unrelated languages, and help us begin to understand how the unique relationship between creoles and their high-resource base languages affects cross-lingual transfer. JamPatoisNLI, which consists of naturally-occurring premises and expert-written hypotheses, is a step towards steering research into a traditionally underserved language and a useful benchmark for understanding cross-lingual NLP.
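
To make the few-shot setting above concrete, here is a minimal Python sketch of how a premise-hypothesis pair could be represented and how a small, label-balanced training subset could be drawn. It is an illustration under stated assumptions, not the authors' pipeline: the `NLIExample` class, the `sample_few_shot` helper, and the three-way label set (entailment, neutral, contradiction, the usual NLI convention) are hypothetical names chosen for this example, and the page does not describe the dataset's actual file format or sampling procedure.

```python
import random
from collections import defaultdict
from dataclasses import dataclass

# Assumption: the conventional three-way NLI label set.
LABELS = ("entailment", "neutral", "contradiction")


@dataclass(frozen=True)
class NLIExample:
    """One NLI item (hypothetical schema, not the dataset's file format)."""
    premise: str      # naturally occurring sentence
    hypothesis: str   # expert-written sentence to judge against the premise
    label: str        # one of LABELS


def sample_few_shot(examples, k_per_label, seed=0):
    """Draw k_per_label examples for each label, without replacement."""
    by_label = defaultdict(list)
    for ex in examples:
        if ex.label not in LABELS:
            raise ValueError(f"unknown label: {ex.label!r}")
        by_label[ex.label].append(ex)

    rng = random.Random(seed)  # fixed seed keeps the subset reproducible
    subset = []
    for label in LABELS:
        pool = by_label[label]
        if len(pool) < k_per_label:
            raise ValueError(f"only {len(pool)} examples for {label!r}")
        subset.extend(rng.sample(pool, k_per_label))

    rng.shuffle(subset)  # avoid presenting the examples grouped by label
    return subset


# Placeholder pool: the strings below are stand-ins, not dataset content.
pool = [
    NLIExample(f"premise {i}", f"hypothesis {i}", LABELS[i % 3])
    for i in range(30)
]
few_shot = sample_few_shot(pool, k_per_label=4, seed=13)
print(len(few_shot))  # 12 examples: 4 per label
```

Balancing by label is a design choice for this sketch: with only a handful of training examples, an unbalanced draw could leave a class unrepresented.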

Variants: JamPatoisNLI

Associated Benchmarks

This dataset is used in 1 benchmark:

Recent Benchmark Submissions

Task | Model | Paper | Date
Natural Language Inference | roberta-unfrozen | JamPatoisNLI: A Jamaican Patois Natural … | 2022-12-07
Natural Language Inference | bert-uncased-unfrozen | JamPatoisNLI: A Jamaican Patois Natural … | 2022-12-07

Research Papers

Recent papers with results on this dataset: