VRD

Visual Relationship Detection dataset

Dataset Information
Modalities
Images, Texts
Introduced
2016
License
Unknown
Homepage

Overview

The Visual Relationship Dataset (VRD) contains 4,000 training images and 1,000 test images annotated with visual relationships. Each bounding box is annotated with one of 100 unary predicates (object classes), covering animals, vehicles, clothes, and generic objects. Pairs of bounding boxes are annotated with one of 70 binary predicates, covering actions, prepositions, spatial relations, comparatives, and prepositional phrases. The dataset contains 37,993 instances of visual relationships spanning 6,672 relationship types. 1,877 relationship instances occur only in the test set and are used to evaluate the zero-shot learning scenario.
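The zero-shot split described above can be derived mechanically: a test relationship instance is "zero-shot" if its (subject class, predicate, object class) triple never appears in the training annotations. A minimal sketch, assuming hypothetical lists of class-level triples rather than the dataset's actual JSON annotation schema:

```python
def zero_shot_triples(train_triples, test_triples):
    """Return test relationship instances whose (subject, predicate, object)
    type never occurs in the training set."""
    seen = set(train_triples)  # relationship types observed in training
    return [t for t in test_triples if t not in seen]

# Hypothetical toy annotations (not actual VRD data):
train = [("person", "ride", "horse"), ("person", "wear", "shirt")]
test = [("person", "ride", "horse"), ("dog", "wear", "shirt")]

print(zero_shot_triples(train, test))  # → [('dog', 'wear', 'shirt')]
```

Running this over the real train/test annotations should recover the 1,877 zero-shot instances reported above.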

Source: Compensating Supervision Incompleteness with Prior Knowledge in Semantic Image Interpretation
Image Source: https://cs.stanford.edu/people/ranjaykrishna/vrd/

Variants: VRD, VRD Predicate Detection, VRD Phrase Detection, VRD Relationship Detection

Associated Benchmarks

This dataset is used in 2 benchmarks.

Recent Benchmark Submissions

Task | Model | Paper | Date
Visual Relationship Detection | Ours - v | Improving Visual Relation Detection using … | 2019-05-02
Scene Graph Generation | FactorizableNet | Factorizable Net: An Efficient Subgraph-based … | 2018-06-29
Scene Graph Generation | VRD | Visual Relationship Detection with Language … | 2016-07-31

Research Papers

Recent papers with results on this dataset: