e-SNLI-VE is a large vision-language (VL) dataset with natural language explanations (NLEs). It contains over 430k instances whose explanations rely on the image content. It was built by merging the explanations from e-SNLI with the image-sentence pairs from SNLI-VE.
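The merge described above can be pictured as a join between the two sources. The following is only a minimal sketch under stated assumptions: the field names (`pair_id`, `image_id`, `hypothesis`, `label`, `explanation`), the shared-identifier join, and the toy records are all hypothetical and are not taken from the actual e-SNLI-VE release or its construction pipeline.

```python
from typing import Dict, List


def merge_explanations(
    ve_pairs: List[Dict[str, str]],
    explanations: Dict[str, str],
) -> List[Dict[str, str]]:
    """Attach an explanation to each image-sentence pair that has one.

    Assumption: both sources share a hypothetical `pair_id` key.
    Pairs without a matching explanation are dropped.
    """
    merged = []
    for pair in ve_pairs:
        nle = explanations.get(pair["pair_id"])
        if nle is None:
            continue  # no explanation available for this pair
        merged.append({**pair, "explanation": nle})
    return merged


# Toy records, invented purely for illustration.
ve_pairs = [
    {"pair_id": "p1", "image_id": "img_001.jpg",
     "hypothesis": "A dog is outside.", "label": "entailment"},
    {"pair_id": "p2", "image_id": "img_002.jpg",
     "hypothesis": "A man is sleeping.", "label": "contradiction"},
]
explanations = {"p1": "The dog is running on grass, so it is outside."}

merged = merge_explanations(ve_pairs, explanations)
```

In this sketch, each merged record keeps the image reference, hypothesis, and label from the image-sentence pair and gains one explanation string; pairs with no matching explanation are left out.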
Variants: e-SNLI-VE
This dataset is used in 1 benchmark:
Task | Model | Paper | Date |
---|---|---|---|
Explanation Generation | OFA-X | Harnessing the Power of Multi-Task … | 2022-12-08 |
Explanation Generation | OFA-X-MT | Harnessing the Power of Multi-Task … | 2022-12-08 |