SciTail

Dataset Information
Modalities: Texts
Languages: English
License: Unknown

Overview

SciTail is a textual entailment dataset built from multiple-choice science exams and web sentences. Each question and its correct answer choice are converted into an assertive statement, which serves as the hypothesis. Information retrieval is used to fetch relevant sentences from a large corpus of web text, and each retrieved sentence serves as a premise. Crowdworkers then annotate each premise-hypothesis pair as supports (entails) or not (neutral). The dataset contains 27,026 examples: 10,101 labeled entails and 16,925 labeled neutral.
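The label counts above imply a moderately imbalanced two-class problem. The following is a minimal sketch of how a premise-hypothesis pair might be represented and how the class balance follows from the stated counts. The `EntailmentPair` type, its field names, and the helper functions are illustrative assumptions and not part of any official SciTail loader. The example sentences are made up for illustration and are not taken from the dataset.

```python
from collections import Counter
from dataclasses import dataclass

# Label counts exactly as stated in the overview (whole dataset, not a single split).
LABEL_COUNTS = {"entails": 10_101, "neutral": 16_925}


@dataclass(frozen=True)
class EntailmentPair:
    """One premise-hypothesis pair. Field names are hypothetical."""
    premise: str      # sentence retrieved from the web corpus
    hypothesis: str   # question + correct answer rewritten as a statement
    label: str        # "entails" or "neutral"


def label_distribution(counts):
    """Return each label's share of the total number of examples."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def majority_baseline(counts):
    """Accuracy obtained by always predicting the most frequent label."""
    return max(counts.values()) / sum(counts.values())


def count_labels(pairs):
    """Tally labels over an iterable of EntailmentPair objects."""
    return dict(Counter(pair.label for pair in pairs))


if __name__ == "__main__":
    # Made-up toy pair, only to show the shape of a record.
    toy = EntailmentPair(
        premise="Plants use sunlight to make their own food.",
        hypothesis="Plants make food using energy from sunlight.",
        label="entails",
    )
    print(count_labels([toy]))
    print(sum(LABEL_COUNTS.values()))
    print(label_distribution(LABEL_COUNTS))
    print(round(majority_baseline(LABEL_COUNTS), 4))
```

Under these counts, always predicting neutral is correct on roughly 63% of the full dataset, which is a useful floor to keep in mind when reading reported accuracies. The balance within individual train, dev, and test splits may differ.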

Source: Allen Institute for AI

Variants: SciTail

Associated Benchmarks

This dataset is used in 1 benchmark, for the Natural Language Inference task.

Recent Benchmark Submissions

Task | Model | Paper | Date
Natural Language Inference | SplitEE-S | SplitEE: Early Exit in Deep … | 2023-09-17
Natural Language Inference | CA-MTL | Conditionally Adaptive Multi-Task Learning: Improving … | 2020-09-19
Natural Language Inference | MT-DNN-SMARTLARGEv0 | SMART: Robust and Efficient Fine-Tuning … | 2019-11-08
Natural Language Inference | MT-DNN-SMART_100%ofTrainingData | SMART: Robust and Efficient Fine-Tuning … | 2019-11-08
Natural Language Inference | MT-DNN-SMART_10%ofTrainingData | SMART: Robust and Efficient Fine-Tuning … | 2019-11-08
Natural Language Inference | MT-DNN-SMART_1%ofTrainingData | SMART: Robust and Efficient Fine-Tuning … | 2019-11-08
Natural Language Inference | MT-DNN-SMART_0.1%ofTrainingData | SMART: Robust and Efficient Fine-Tuning … | 2019-11-08
Natural Language Inference | RE2 | Simple and Effective Text Matching … | 2019-08-01
Natural Language Inference | MT-DNN | Multi-Task Deep Neural Networks for … | 2019-01-31
Natural Language Inference | Hierarchical BiLSTM Max Pooling | Sentence Embeddings in NLI with … | 2018-08-27
Natural Language Inference | CAFE | Compare, Compress and Propagate: Enhancing … | 2017-12-30

Research Papers

Recent papers with results on this dataset: