A GQA-based dataset with 1,040,830 multi-modal explanations of visual reasoning processes.
Variants: GQA-REX
This dataset is used in 1 benchmark:
| Task | Model | Paper | Date |
|---|---|---|---|
| Explanatory Visual Question Answering | REX-LXMERT | REX: Reasoning-aware and Grounded Explanation | 2022-03-11 |
| Explanatory Visual Question Answering | REX-VisualBert | REX: Reasoning-aware and Grounded Explanation | 2022-03-11 |
| Explanatory Visual Question Answering | EXP | Faithful Multimodal Explanation for Visual … | 2018-09-08 |
| Explanatory Visual Question Answering | VQAE | VQA-E: Explaining, Elaborating, and Enhancing … | 2018-03-20 |
Recent papers with results on this dataset: