VQA-CP

Dataset Information
Modalities: Images, Texts
Introduced: 2018
License: Unknown

Overview

The VQA-CP dataset was constructed by reorganizing VQA v2 so that the correlation between question type and correct answer differs between the training and test splits. For example, the most common answer to questions starting with "What sport…" is "tennis" in the training set but "skiing" in the test set. A model that guesses its answer primarily from the question, without using the image, will therefore perform poorly on the test split.
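The effect of this prior shift can be illustrated with a minimal sketch. The examples below are made up for illustration and are not taken from VQA-CP. The helper names (`most_common_answer_by_prefix`, `question_only_accuracy`) and the choice of the first two words as the "question type" are assumptions. Exact-match accuracy is used as a simplification of the official VQA metric.

```python
from collections import Counter, defaultdict


def question_prefix(question, prefix_words=2):
    """Use the first few lowercase words as a crude question type (an assumption)."""
    return " ".join(question.lower().split()[:prefix_words])


def most_common_answer_by_prefix(examples, prefix_words=2):
    """Map each question prefix to its most frequent answer in `examples`."""
    counts = defaultdict(Counter)
    for question, answer in examples:
        counts[question_prefix(question, prefix_words)][answer] += 1
    return {prefix: c.most_common(1)[0][0] for prefix, c in counts.items()}


def question_only_accuracy(prior, examples, prefix_words=2):
    """Exact-match accuracy of a baseline that ignores the image entirely."""
    correct = sum(
        prior.get(question_prefix(q, prefix_words)) == a for q, a in examples
    )
    return correct / len(examples)


# Toy (question, answer) pairs, invented for illustration only.
train = [
    ("What sport is this?", "tennis"),
    ("What sport is being played?", "tennis"),
    ("What sport is shown?", "skiing"),
]
test = [
    ("What sport is this?", "skiing"),
    ("What sport are they playing?", "skiing"),
    ("What sport is shown?", "tennis"),
]

train_prior = most_common_answer_by_prefix(train)
test_prior = most_common_answer_by_prefix(test)

print(train_prior)  # {'what sport': 'tennis'}
print(test_prior)   # {'what sport': 'skiing'}

# The baseline learned on the training prior does worse once the prior changes.
acc_train = question_only_accuracy(train_prior, train)
acc_test = question_only_accuracy(train_prior, test)
print(acc_train > acc_test)  # True
```

In this toy setting the question-only baseline answers 2 of 3 training questions correctly but only 1 of 3 test questions, which is the kind of drop the changed priors are meant to expose.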

Source: Unshuffling Data for Improved Generalization
Image Source: https://arxiv.org/pdf/1712.00377.pdf

Variants: VQA-CP

Associated Benchmarks

This dataset is used in 1 benchmark.

Recent Benchmark Submissions

Task | Model | Paper | Date
Visual Question Answering (VQA) | GGE-DQ | Greedy Gradient Ensemble for Robust … | 2021-07-27
Visual Question Answering (VQA) | LMH+Entropy regularization (Ensemble) | Removing Bias in Multi-modal Classifiers: … | 2020-10-21
Visual Question Answering (VQA) | LMH+Entropy regularization | Removing Bias in Multi-modal Classifiers: … | 2020-10-21
Visual Question Answering (VQA) | CSS | Counterfactual Samples Synthesizing for Robust … | 2020-03-14
Visual Question Answering (VQA) | Learned-Mixin +H | Don't Take the Easy Way … | 2019-09-09
Visual Question Answering (VQA) | NSM | Learning by Abstraction: The Neural … | 2019-07-09
Visual Question Answering (VQA) | RUBi | RUBi: Reducing Unimodal Biases in … | 2019-06-24
Visual Question Answering (VQA) | UpDn+SCR (VQA-X) | Self-Critical Reasoning for Robust Visual … | 2019-05-24
Visual Question Answering (VQA) | MuRel | MUREL: Multimodal Relational Reasoning for … | 2019-02-25
Visual Question Answering (VQA) | HAN | Learning Visual Question Answering by … | 2018-08-01

Research Papers

Recent papers with results on this dataset: