The TGIF-QA dataset contains 165K QA pairs for the animated GIFs from the TGIF dataset [Li et al., CVPR 2016]. The question-answer pairs were collected via crowdsourcing with a carefully designed user interface to ensure quality. The dataset can be used to evaluate video-based Visual Question Answering (VQA) techniques.
Source: GitHub
Image Source: https://github.com/YunseokJANG/tgif-qa
Variants: TGIF-QA
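The dataset's QA pairs are distributed as delimited text files in the repository linked above. As a minimal sketch of working with them, the snippet below parses a small tab-separated sample into QA records; the column names (`gif_name`, `question`, `answer`) and the TSV layout are assumptions for illustration, so check the actual files in the TGIF-QA repository before relying on them.

```python
import csv
import io

# Hypothetical sample mimicking a TGIF-QA question file.
# Column names and layout are assumptions, not the official schema.
sample_tsv = (
    "gif_name\tquestion\tanswer\n"
    "tumblr_abc123\twhat does the cat do\tjump\n"
    "tumblr_def456\thow many people are dancing\ttwo\n"
)

def load_qa_pairs(tsv_text):
    """Parse TSV text into a list of dicts, one per QA pair."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return list(reader)

pairs = load_qa_pairs(sample_tsv)
print(len(pairs))          # 2
print(pairs[0]["answer"])  # jump
```

For the real files, replace the in-memory string with `open(path, newline="")` and keep the tab delimiter.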
This dataset is used in 2 benchmarks:
| Task | Model | Paper | Date |
|---|---|---|---|
| Video Question Answering | LocVLM-Vid-B | Learning to Localize Objects Improves … | 2024-04-11 |
| Visual Question Answering (VQA) | HiTeA | HiTeA: Hierarchical Temporal-Aware Video-Language Pre-training | 2022-12-30 |
| Visual Question Answering (VQA) | InternVideo | InternVideo: General Video Foundation Models … | 2022-12-06 |