PhotoChat

Dataset Information
Modalities
Images, Texts
Languages
English
Introduced
2021

Overview

PhotoChat is the first dataset that casts light on photo-sharing behavior in online messaging. It contains 12k dialogues, each paired with a user photo that is shared during the conversation. Based on this dataset, two tasks are proposed to facilitate research on image-text modeling: a photo-sharing intent prediction task, which predicts whether a user intends to share a photo in the next conversation turn, and a photo retrieval task, which retrieves the most relevant photo given the dialogue context.
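The two tasks can be framed concretely: intent prediction is a binary classification over dialogue turns, and photo retrieval is a ranking problem over candidate images. Below is a minimal sketch of both framings, assuming a hypothetical record layout with `dialogue` turns, a per-turn `share_photo` flag, and precomputed context/photo embeddings; the field names and the dual-encoder scoring are illustrative, not the official PhotoChat schema or baseline.

```python
from typing import Dict, List
import numpy as np

# Hypothetical PhotoChat-style record (field names are illustrative only).
example = {
    "dialogue": [
        {"speaker": 0, "text": "How was the hike yesterday?", "share_photo": False},
        {"speaker": 1, "text": "Amazing, the view from the top was unreal.", "share_photo": True},
    ],
    "photo_id": "img_01234",
}

def intent_examples(record: Dict) -> List[Dict]:
    """Build (context, label) pairs for photo-sharing intent prediction:
    the label marks whether a photo is shared in the *next* turn."""
    turns = record["dialogue"]
    pairs = []
    for i in range(len(turns) - 1):
        context = " ".join(t["text"] for t in turns[: i + 1])
        pairs.append({"context": context, "label": int(turns[i + 1]["share_photo"])})
    return pairs

def retrieve(context_emb: np.ndarray, photo_embs: np.ndarray, k: int = 5) -> np.ndarray:
    """Rank candidate photos by dot-product similarity to the dialogue context
    and return the indices of the top-k photos (dual-encoder style)."""
    scores = photo_embs @ context_emb
    return np.argsort(-scores)[:k]
```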

Variants: PhotoChat

Associated Benchmarks

This dataset is used in 1 benchmark: Image Retrieval on PhotoChat.

Recent Benchmark Submissions

Task            | Model | Paper                                                        | Date
Image Retrieval | PaCE  | PaCE: Unified Multi-modal Dialogue Pre-training …            | 2023-05-24
Image Retrieval | VLMo  | VLMo: Unified Vision-Language Pre-Training with …            | 2021-11-03
Image Retrieval | DE++  | PhotoChat: A Human-Human Dialogue Dataset …                  | 2021-07-06
Image Retrieval | ViLT  | ViLT: Vision-and-Language Transformer Without Convolution …  | 2021-02-05
Image Retrieval | SCAN  | Stacked Cross Attention for Image-Text …                     | 2018-03-21
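Submissions to the image retrieval benchmark are typically compared with recall@k over a pool of candidate photos. A minimal sketch of that metric follows, assuming a precomputed similarity matrix where the correct photo for each dialogue context sits on the diagonal; this is a generic evaluation sketch, not tied to any specific model in the table above.

```python
import numpy as np

def recall_at_k(scores: np.ndarray, k: int) -> float:
    """scores[i, j] = similarity between dialogue context i and candidate photo j;
    the ground-truth photo for context i is assumed to be photo i."""
    ranks = (-scores).argsort(axis=1)          # photo indices sorted by descending score
    hits = (ranks[:, :k] == np.arange(len(scores))[:, None]).any(axis=1)
    return float(hits.mean())

# Example: random scores for 100 contexts x 100 candidate photos.
rng = np.random.default_rng(0)
print(recall_at_k(rng.normal(size=(100, 100)), k=10))
```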

Research Papers

Recent papers with results on this dataset: