Visual Relationship Detection dataset
The Visual Relationship Dataset (VRD) contains 4,000 training images and 1,000 test images annotated with visual relationships. Each bounding box is annotated with one of 100 unary predicates (object categories), covering animals, vehicles, clothes and generic objects. Each pair of bounding boxes is annotated with one of 70 binary predicates, covering actions, prepositions, spatial relations, comparatives and prepositional phrases. The dataset contains 37,993 relationship instances spanning 6,672 relationship types; 1,877 relationships occur only in the test set and are used to evaluate the zero-shot learning scenario.
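The annotation structure can be illustrated with a short sketch. The code below is a minimal, hypothetical example of loading such annotations and identifying the zero-shot relationships as the triplet types that appear in the test split but never in training. It assumes a JSON layout mapping each image to a list of relationships with `subject`, `predicate` and `object` fields; the file names and field names are assumptions for illustration, not necessarily the official release format.

```python
import json
from typing import Dict, List, Set, Tuple

# Assumed annotation layout (field names are an assumption, not the
# official VRD schema): each image maps to a list of relationship
# records, where subject/object carry a category index into the 100
# object labels and `predicate` indexes into the 70 binary predicates.
# {
#   "img_0001.jpg": [
#     {"subject": {"category": 12, "bbox": [...]},
#      "predicate": 4,
#      "object":  {"category": 57, "bbox": [...]}},
#     ...
#   ],
#   ...
# }

Triplet = Tuple[int, int, int]  # (subject category, predicate, object category)

def load_triplet_types(path: str) -> Set[Triplet]:
    """Collect the distinct <subject, predicate, object> types in a split."""
    with open(path) as f:
        annotations: Dict[str, List[dict]] = json.load(f)
    triplets: Set[Triplet] = set()
    for relationships in annotations.values():
        for rel in relationships:
            triplets.add((rel["subject"]["category"],
                          rel["predicate"],
                          rel["object"]["category"]))
    return triplets

if __name__ == "__main__":
    train = load_triplet_types("annotations_train.json")  # assumed file name
    test = load_triplet_types("annotations_test.json")    # assumed file name
    # Zero-shot relationships: triplet types seen only at test time.
    zero_shot = test - train
    print(f"{len(train | test)} relationship types, {len(zero_shot)} zero-shot")
```

Under this layout, the zero-shot split is simply the set difference between test-time and training-time triplet types.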
Source: Compensating Supervision Incompleteness with Prior Knowledge in Semantic Image Interpretation
Image Source: https://cs.stanford.edu/people/ranjaykrishna/vrd/
Variants: VRD, VRD Predicate Detection, VRD Phrase Detection, VRD Relationship Detection
This dataset is used in 2 benchmarks:
| Task | Model | Paper | Date |
|---|---|---|---|
| Visual Relationship Detection | Ours - v | Improving Visual Relation Detection using … | 2019-05-02 |
| Scene Graph Generation | FactorizableNet | Factorizable Net: An Efficient Subgraph-based … | 2018-06-29 |
| Scene Graph Generation | VRD | Visual Relationship Detection with Language … | 2016-07-31 |