Vietnamese Natural Language Inference Dataset
Research on natural language inference (NLI) for Vietnamese, which can be considered a low-resource language, requires a large-scale, high-quality corpus. In this paper, we introduce ViNLI (Vietnamese Natural Language Inference), an open-domain, high-quality corpus for evaluating Vietnamese NLI models, created and evaluated under a strict quality-control process. ViNLI comprises over 30,000 human-annotated premise-hypothesis sentence pairs extracted from more than 800 online news articles on 13 distinct topics.
Variants: ViNLI
This dataset is used in 1 benchmark:
| Task | Model | Paper | Date |
|---|---|---|---|
| Vietnamese Natural Language Inference | CafeBERT | VLUE: A New Benchmark and … | 2024-03-23 |