OK-VQA

Outside Knowledge Visual Question Answering

Dataset Information
Modalities: Images, Texts
Languages: English
License: Unknown

Overview

Outside Knowledge Visual Question Answering (OK-VQA) is a benchmark of more than 14,000 questions whose answers require external knowledge beyond what is visible in the image.

Source: OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge
Image Source: https://okvqa.allenai.org/

Variants: OK-VQA

Associated Benchmarks

This dataset is used in 2 benchmarks.

Recent Benchmark Submissions

| Task | Model | Paper | Date |
| Visual Question Answering (VQA) | HYDRA | HYDRA: A Hyper Agent for … | 2024-03-19 |
| Visual Question Answering (VQA) | Lyrics | Lyrics: Boosting Fine-grained Language-Vision Alignment … | 2023-12-08 |
| Visual Question Answering (VQA) | PaLI-X-VPD | Visual Program Distillation: Distilling Tools … | 2023-12-05 |
| Visual Question Answering (VQA) | A Simple Baseline for KB-VQA | A Simple Baseline for Knowledge-Based … | 2023-10-20 |
| Retrieval | FLMR | Fine-grained Late-interaction Multi-modal Retrieval for … | 2023-09-29 |
| Visual Question Answering (VQA) | RA-VQA-v2 (T5-large) | Fine-grained Late-interaction Multi-modal Retrieval for … | 2023-09-29 |
| Visual Question Answering (VQA) | RA-VQA-v2 (BLIP 2) | Fine-grained Late-interaction Multi-modal Retrieval for … | 2023-09-29 |
| Visual Question Answering (VQA) | PaLI-X (Single-task FT) | PaLI-X: On Scaling up a … | 2023-05-29 |
| Visual Question Answering (VQA) | PaLM-E-562B | PaLM-E: An Embodied Multimodal Language … | 2023-03-06 |
| Visual Question Answering (VQA) | Prophet | Prophet: Prompting Large Language Models … | 2023-03-03 |
| Visual Question Answering (VQA) | VK-OOD | Differentiable Outlier Detection Enable Robust … | 2023-02-11 |
| Visual Question Answering (VQA) | BLIP-2 ViT-G FlanT5 XXL (zero-shot) | BLIP-2: Bootstrapping Language-Image Pre-training with … | 2023-01-30 |
| Visual Question Answering (VQA) | BLIP-2 ViT-G FlanT5 XL (zero-shot) | BLIP-2: Bootstrapping Language-Image Pre-training with … | 2023-01-30 |
| Visual Question Answering (VQA) | BLIP-2 ViT-L OPT 2.7B (zero-shot) | BLIP-2: Bootstrapping Language-Image Pre-training with … | 2023-01-30 |
| Visual Question Answering (VQA) | BLIP-2 ViT-G OPT 2.7B (zero-shot) | BLIP-2: Bootstrapping Language-Image Pre-training with … | 2023-01-30 |
| Visual Question Answering (VQA) | BLIP-2 ViT-G OPT 6.7B (zero-shot) | BLIP-2: Bootstrapping Language-Image Pre-training with … | 2023-01-30 |
| Visual Question Answering (VQA) | BLIP-2 ViT-L FlanT5 XL (zero-shot) | BLIP-2: Bootstrapping Language-Image Pre-training with … | 2023-01-30 |
| Visual Question Answering (VQA) | ReVeaL WIT + CC12M + Wikidata + VQA-2 | REVEAL: Retrieval-Augmented Visual-Language Pre-Training with … | 2022-12-10 |
| Visual Question Answering (VQA) | PromptCap | PromptCap: Prompt-Guided Task-Aware Image Captioning | 2022-11-15 |
| Visual Question Answering (VQA) | VLC-BERT | VLC-BERT: Visual Question Answering with … | 2022-10-24 |

Research Papers

Recent papers with results on this dataset: