VizWiz-VQA
The VizWiz-VQA dataset originates from a natural visual question answering setting in which blind people each took an image and recorded a spoken question about it, together with 10 crowdsourced answers per visual question. The proposed challenge addresses two tasks for this dataset: (1) predict the answer to a visual question and (2) predict whether a visual question cannot be answered.
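For task (1), predictions are typically scored against the 10 crowdsourced answers with the consensus accuracy metric used in the VQA challenges: an answer receives credit min(#annotators who gave it / 3, 1), averaged over leave-one-out subsets of the human answers. The snippet below is a minimal sketch of that scoring rule, not the official evaluation script; it assumes simple lower-cased exact-match comparison, whereas the official tooling applies additional answer normalization.

```python
from typing import List


def vqa_accuracy(prediction: str, human_answers: List[str]) -> float:
    """Consensus accuracy of one prediction against the 10 crowdsourced answers.

    Sketch only: real evaluation scripts normalize answers more aggressively
    (punctuation, articles, number words) before matching.
    """
    pred = prediction.strip().lower()
    answers = [a.strip().lower() for a in human_answers]
    scores = []
    for i in range(len(answers)):
        # Leave annotator i out and count exact matches among the rest.
        others = answers[:i] + answers[i + 1:]
        matches = sum(1 for a in others if a == pred)
        # Full credit once at least 3 of the remaining annotators agree.
        scores.append(min(matches / 3.0, 1.0))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    gold = ["coca cola", "coke", "coca cola", "coke", "coke",
            "coca cola", "soda", "coke", "coke", "coca cola"]
    print(f"accuracy: {vqa_accuracy('coke', gold):.3f}")
```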
Source: https://vizwiz.org/tasks-and-datasets/vqa/
Image Source: https://vizwiz.org/tasks-and-datasets/vqa/
Variants: VizWiz Answer Differences 2019, VizWiz 2020 VQA, VizWiz 2020 test-dev, VizWiz 2020 test, VizWiz 2020 Answerability, VizWiz 2018 Answerability, VizWiz 2018, VizWiz
This dataset is used in 1 benchmark:
| Task | Model | Paper | Date |
|---|---|---|---|
| Visual Question Answering | Emu-I * | Emu: Generative Pretraining in Multimodality | 2023-07-11 |