The VQA-CP dataset was constructed by reorganizing VQA v2 so that the correlation between question type and correct answer differs between the training and test splits. For example, the most common answer to questions starting with "What sport…" is "tennis" in the training set but "skiing" in the test set. A model that guesses its answer primarily from the question, without grounding it in the image, will therefore perform poorly on the test split.
Source: Unshuffling Data for Improved Generalization
Image Source: https://arxiv.org/pdf/1712.00377.pdf
Variants: VQA-CP
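The effect of the changed priors described above can be illustrated with a minimal sketch. The toy `train` and `test` lists below are invented for illustration and are not taken from VQA-CP. The helper names `answer_prior` and `prior_accuracy` are hypothetical, and exact-match accuracy is used as a simplification of the usual VQA metric. The "model" here ignores the image entirely and always returns the most frequent training answer for the question type.

```python
from collections import Counter, defaultdict


def answer_prior(samples):
    """Map each question type to its most frequent answer in `samples`."""
    counts = defaultdict(Counter)
    for question_type, answer in samples:
        counts[question_type][answer] += 1
    return {qt: c.most_common(1)[0][0] for qt, c in counts.items()}


def prior_accuracy(prior, samples):
    """Exact-match accuracy of a question-only baseline (simplified metric)."""
    correct = sum(prior.get(qt) == answer for qt, answer in samples)
    return correct / len(samples)


# Toy (question type, answer) pairs, invented for illustration only.
# The answer distribution per question type is deliberately flipped
# between the two splits, mimicking the idea behind VQA-CP.
train = (
    [("what sport", "tennis")] * 6 + [("what sport", "skiing")] * 2
    + [("what color", "white")] * 5 + [("what color", "red")] * 3
)
test = (
    [("what sport", "skiing")] * 6 + [("what sport", "tennis")] * 2
    + [("what color", "red")] * 5 + [("what color", "white")] * 3
)

prior = answer_prior(train)                 # learned from training split only
train_acc = prior_accuracy(prior, train)    # 11 of 16 correct
test_acc = prior_accuracy(prior, test)      # 5 of 16 correct

print(f"question-only baseline: train={train_acc:.4f}, test={test_acc:.4f}")
```

On this toy data the question-only baseline answers 11 of 16 training questions correctly but only 5 of 16 test questions, which is the kind of gap the split is designed to expose.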
This dataset is used in 1 benchmark:
Task | Model | Paper | Date |
---|---|---|---|
Visual Question Answering (VQA) | GGE-DQ | Greedy Gradient Ensemble for Robust … | 2021-07-27 |
Visual Question Answering (VQA) | LMH+Entropy regularization (Ensemble) | Removing Bias in Multi-modal Classifiers: … | 2020-10-21 |
Visual Question Answering (VQA) | LMH+Entropy regularization | Removing Bias in Multi-modal Classifiers: … | 2020-10-21 |
Visual Question Answering (VQA) | CSS | Counterfactual Samples Synthesizing for Robust … | 2020-03-14 |
Visual Question Answering (VQA) | Learned-Mixin +H | Don't Take the Easy Way … | 2019-09-09 |
Visual Question Answering (VQA) | NSM | Learning by Abstraction: The Neural … | 2019-07-09 |
Visual Question Answering (VQA) | RUBi | RUBi: Reducing Unimodal Biases in … | 2019-06-24 |
Visual Question Answering (VQA) | UpDn+SCR (VQA-X) | Self-Critical Reasoning for Robust Visual … | 2019-05-24 |
Visual Question Answering (VQA) | MuRel | MUREL: Multimodal Relational Reasoning for … | 2019-02-25 |
Visual Question Answering (VQA) | HAN | Learning Visual Question Answering by … | 2018-08-01 |
Recent papers with results on this dataset: