IllusionVQA

Dataset Information
Modalities: Images, Texts
Languages: English
Introduced: 2024

Overview

IllusionVQA is a Visual Question Answering (VQA) dataset with two sub-tasks. The first task tests comprehension on 435 instances across 12 optical illusion categories. Each instance consists of an image containing an optical illusion, a question, and 3 to 6 answer options, exactly one of which is correct. We refer to this task as IllusionVQA-Comprehension. The second task tests how well vision-language models (VLMs) can distinguish geometrically impossible objects from ordinary objects when two objects are presented side by side. It consists of 1000 instances in a format similar to the first task. We refer to this task as IllusionVQA-Soft-Localization.
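
The instance format described above (an image, a question, 3 to 6 options, one correct answer) can be sketched in Python as follows. This is an illustrative sketch only: the class name, field names, and helper functions are assumptions made for this example and are not taken from the dataset's actual schema or from any official loader.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class IllusionInstance:
    """One multiple-choice item. All field names here are hypothetical."""
    image_path: str     # path to the image (assumed to be stored as a file)
    question: str       # the question asked about the image
    options: List[str]  # 3 to 6 answer options
    answer_index: int   # 0-based index of the single correct option

    def __post_init__(self) -> None:
        # Enforce the constraints stated in the overview.
        if not 3 <= len(self.options) <= 6:
            raise ValueError("expected 3 to 6 options")
        if not 0 <= self.answer_index < len(self.options):
            raise ValueError("answer_index must point at one of the options")


def format_prompt(item: IllusionInstance) -> str:
    """Render the question and lettered options as a text prompt."""
    letters = "ABCDEF"
    lines = [item.question]
    lines += [f"{letters[i]}. {opt}" for i, opt in enumerate(item.options)]
    return "\n".join(lines)


def accuracy(items: Sequence[IllusionInstance], predictions: Sequence[int]) -> float:
    """Fraction of items whose predicted option index matches the answer."""
    if len(items) != len(predictions):
        raise ValueError("items and predictions must have the same length")
    if not items:
        return 0.0
    correct = sum(pred == item.answer_index for item, pred in zip(items, predictions))
    return correct / len(items)
```

Because both sub-tasks are described as sharing a similar multiple-choice format, the same structure could plausibly represent items from either one, with accuracy computed separately per sub-task.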

Variants: IllusionVQA

Associated Benchmarks

This dataset is used in 2 benchmarks: Visual Question Answering (VQA) and Object Localization.

Recent Benchmark Submissions

Task                            | Model                  | Paper                                      | Date
Visual Question Answering (VQA) | GPT4-Vision 4-shot     | IllusionVQA: A Challenging Optical Illusion … | 2024-03-23
Visual Question Answering (VQA) | GPT4-Vision            | IllusionVQA: A Challenging Optical Illusion … | 2024-03-23
Visual Question Answering (VQA) | Gemini-Pro 4-shot      | IllusionVQA: A Challenging Optical Illusion … | 2024-03-23
Visual Question Answering (VQA) | Gemini-Pro             | IllusionVQA: A Challenging Optical Illusion … | 2024-03-23
Visual Question Answering (VQA) | LLaVA-1.5-13B          | IllusionVQA: A Challenging Optical Illusion … | 2024-03-23
Visual Question Answering (VQA) | CogVLM                 | IllusionVQA: A Challenging Optical Illusion … | 2024-03-23
Visual Question Answering (VQA) | InstructBLIP-13B       | IllusionVQA: A Challenging Optical Illusion … | 2024-03-23
Object Localization             | Gemini-Pro             | IllusionVQA: A Challenging Optical Illusion … | 2024-03-23
Object Localization             | Gemini-Pro 4-shot      | IllusionVQA: A Challenging Optical Illusion … | 2024-03-23
Object Localization             | GPT4-Vision            | IllusionVQA: A Challenging Optical Illusion … | 2024-03-23
Object Localization             | Gemini-Pro 4-shot+CoT  | IllusionVQA: A Challenging Optical Illusion … | 2024-03-23
Object Localization             | GPT4-Vision 4-shot     | IllusionVQA: A Challenging Optical Illusion … | 2024-03-23
Object Localization             | GPT4-Vision 4-shot+CoT | IllusionVQA: A Challenging Optical Illusion … | 2024-03-23
Object Localization             | CogVLM                 | IllusionVQA: A Challenging Optical Illusion … | 2024-03-23
Object Localization             | LLaVA-1.5-13B          | IllusionVQA: A Challenging Optical Illusion … | 2024-03-23
Object Localization             | InstructBLIP-13B       | IllusionVQA: A Challenging Optical Illusion … | 2024-03-23
