MIT-States

Dataset Information
Modalities: Images
Introduced: 2015
License: Unknown

Overview

The MIT-States dataset contains 245 object classes, 115 attribute classes, and ∼53K images. It covers a wide range of objects (e.g., fish, persimmon, room) and attributes (e.g., mossy, deflated, dirty). Each object affords about 9 attributes on average, and each object instance is modified by one of them.
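To put these figures in perspective, the short sketch below works out how sparse the labeled attribute-object space is. It uses only the numbers quoted above; the function name `composition_stats` and the dictionary keys are illustrative, and the results are rough estimates derived from the stated averages, not official counts of labeled pairs.

```python
# Figures quoted in the overview; the image count is approximate ("~53K").
NUM_OBJECTS = 245
NUM_ATTRIBUTES = 115
NUM_IMAGES = 53_000
AVG_ATTRIBUTES_PER_OBJECT = 9  # average number of attributes an object affords


def composition_stats(num_objects, num_attributes, avg_afforded, num_images):
    """Estimate the size of the attribute-object composition space."""
    # Every attribute combined with every object, whether or not it is meaningful.
    all_pairs = num_objects * num_attributes
    # Estimated number of pairs that actually occur, from the stated average.
    afforded_pairs = num_objects * avg_afforded
    return {
        "all_pairs": all_pairs,
        "afforded_pairs_estimate": afforded_pairs,
        "afforded_fraction": afforded_pairs / all_pairs,
        "images_per_afforded_pair": num_images / afforded_pairs,
    }


stats = composition_stats(
    NUM_OBJECTS, NUM_ATTRIBUTES, AVG_ATTRIBUTES_PER_OBJECT, NUM_IMAGES
)
for name, value in stats.items():
    print(f"{name}: {value:.3f}" if isinstance(value, float) else f"{name}: {value}")
```

Under these assumptions, only a small fraction of all possible attribute-object pairs is afforded, which is why the dataset is commonly used to study compositions that were not seen during training.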

Source: Attributes as Operators: Factorizing Unseen Attribute-Object Compositions
Image Source: http://web.mit.edu/phillipi/Public/states_and_transformations/index.html

Variants: MIT-States; MIT-States, generalized split

Associated Benchmarks

This dataset is used in 2 benchmarks:

Recent Benchmark Submissions

Task | Model | Paper | Date
Zero-Shot Learning | CZSL | LOCL: Learning Object-Attribute Composition using … | 2022-10-07
Image Retrieval with Multi-Modal Query | ComposeAE | Compositional Learning of Image-Text Query … | 2020-06-19
Image Retrieval with Multi-Modal Query | TIRG | Composing Text and Image for … | 2018-12-18
Image Retrieval with Multi-Modal Query | Attribute as Operator | Attributes as Operators: Factorizing Unseen … | 2018-03-27
Image Retrieval with Multi-Modal Query | FiLM | FiLM: Visual Reasoning with a … | 2017-09-22
Image Retrieval with Multi-Modal Query | Show and Tell | Show and Tell: A Neural … | 2014-11-17

Research Papers

Recent papers with results on this dataset: