Task Directed Image Understanding Challenge
The Task Directed Image Understanding Challenge (TDIUC) dataset is a Visual Question Answering (VQA) dataset consisting of 1.6M questions and 170K images sourced from MS COCO and the Visual Genome dataset. The image-question pairs are split into 12 categories, and 4 additional evaluation metrics help assess a model's robustness against answer imbalance and its ability to answer questions that require higher reasoning capability. TDIUC divides the VQA paradigm into 12 task-directed question types, ranging from simpler tasks (e.g., object presence, color attribute) to more complex ones (e.g., counting, positional reasoning). The dataset also includes an "Absurd" question category, in which questions are irrelevant to the image content, to help balance the dataset.
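Because per-type question counts are highly imbalanced, TDIUC's evaluation protocol reports mean-per-type (MPT) accuracies, averaging accuracy across question types rather than across raw questions. A minimal sketch of computing arithmetic and harmonic MPT, assuming a hypothetical flat list of `(question_type, is_correct)` prediction records:

```python
from collections import defaultdict

def mpt_scores(records):
    """Compute mean-per-type accuracies in the TDIUC style.

    `records` is a hypothetical flattening of model predictions:
    a list of (question_type, is_correct) pairs.
    Returns (arithmetic_mpt, harmonic_mpt) over the question types.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for qtype, ok in records:
        total[qtype] += 1
        correct[qtype] += int(ok)

    # Per-type accuracy, one value per question type.
    accs = [correct[t] / total[t] for t in total]

    # Arithmetic mean weights every type equally regardless of size.
    arithmetic = sum(accs) / len(accs)

    # Harmonic mean penalizes poor performance on any single type;
    # it is defined as 0 if any type has zero accuracy.
    if any(a == 0 for a in accs):
        harmonic = 0.0
    else:
        harmonic = len(accs) / sum(1.0 / a for a in accs)

    return arithmetic, harmonic
```

The harmonic mean drops sharply when a model neglects even one question type (e.g., counting), which is why it is the stricter of the two summaries.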
Source: Question-Agnostic Attention for Visual Question Answering
Image Source: https://kushalkafle.com/projects/tdiuc.html
Variants: TDIUC
This dataset is used in 1 benchmark:
Task | Model | Paper | Date |
---|---|---|---|
Visual Question Answering (VQA) | BAN2-CTI | Compact Trilinear Interaction for Visual … | 2019-09-26 |
Visual Question Answering (VQA) | MUREL | MUREL: Multimodal Relational Reasoning for … | 2019-02-25 |