VIST

Name: VIST
Published: 2013-01-01
License: Unknown

Visual Storytelling

Dataset Information

Modalities

Images, Texts

Languages

English

Introduced

2013

License

Unknown

Homepage

Official Website

Contents

Overview
Associated Benchmarks
Recent Benchmark Submissions
Research Papers

Overview

The Visual Storytelling Dataset (VIST) consists of 210,819 unique photos and 50,000 stories. The images were collected from Flickr albums, each containing 10 to 50 images taken within a 48-hour span. The stories were written by workers on Amazon Mechanical Turk, who were instructed to choose five images from an album and write a story about them. Every story has five sentences, and each sentence is paired with its corresponding image. The dataset is split into three subsets: a training set (80%), a validation set (10%), and a test set (10%). In the story text, all words and punctuation marks are separated by a space character, all location names are replaced with the word "location", and all names of people are replaced with the word "male" or "female", depending on the gender of the person.

Source: Stories for Images-in-Sequence by using Visual and Narrative Components (the source notes that this research was partially funded by Pendulibrium and the Faculty of Computer Science and Engineering, Ss. Cyril and Methodius University in Skopje).
Image Source: https://arxiv.org/pdf/1604.03968.pdf
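
The text conventions and split proportions described in the overview can be illustrated with a minimal Python sketch. It is not the dataset's actual preprocessing code: the name and location lookup sets, the function names, and the assumption that the 80/10/10 percentages apply to the 50,000 stories are all hypothetical and used only for illustration.

```python
import re

# Hypothetical lookup sets; the entity lists used to build VIST are not given here.
LOCATIONS = {"paris", "london"}
MALE_NAMES = {"john"}
FEMALE_NAMES = {"mary"}


def normalize_sentence(sentence: str) -> str:
    """Separate words and punctuation with spaces and replace named entities."""
    # Each run of word characters is one token; each punctuation mark is its own token.
    tokens = re.findall(r"\w+|[^\w\s]", sentence)
    out = []
    for tok in tokens:
        key = tok.lower()
        if key in LOCATIONS:
            out.append("location")
        elif key in MALE_NAMES:
            out.append("male")
        elif key in FEMALE_NAMES:
            out.append("female")
        else:
            out.append(tok)
    return " ".join(out)


def split_sizes(n_items: int, train_pct: int = 80, val_pct: int = 10):
    """Return (train, validation, test) counts; the test set takes the remainder."""
    train = n_items * train_pct // 100
    val = n_items * val_pct // 100
    return train, val, n_items - train - val


if __name__ == "__main__":
    print(normalize_sentence("John went to Paris, and Mary followed!"))
    # Assumption: the percentages are applied to the 50,000 stories.
    print(split_sizes(50_000))
```

Matching is case-insensitive so that capitalized names are caught, while all other tokens keep their original casing, since the overview does not say whether the stories are lowercased.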

Variants: VIST

Associated Benchmarks

This dataset is used in 2 benchmarks:

Story Continuation - Metrics: FID
Visual Storytelling - Metrics: BLEU-4, CIDEr, METEOR, BLEU-1, BLEU-2, BLEU-3, ROUGE-L, SPICE, BLEURT, MLTD

Recent Benchmark Submissions

Task	Model	Paper	Date
Story Continuation	AR-LDM (DII captions)	Synthesizing Coherent Story with Auto-Regressive …	2022-11-20
Story Continuation	AR-LDM (SIS captions)	Synthesizing Coherent Story with Auto-Regressive …	2022-11-20
Visual Storytelling	ViT-model	Vision Transformer Based Model for …	2022-10-06
Visual Storytelling	PR-VIST	Plot and Rework: Modeling Storylines …	2021-05-14
Visual Storytelling	MCSM+RNN	Commonsense Knowledge Aware Concept Selection …	2021-02-05
Visual Storytelling	hLSTMs	BERT-hLSTMs: BERT and Hierarchical LSTMs …	2020-12-03
Visual Storytelling	BERT-hLSTMs	BERT-hLSTMs: BERT and Hierarchical LSTMs …	2020-12-03
Visual Storytelling	INet	Hide-and-Tell: Learning to Bridge Photo …	2020-02-03
Visual Storytelling	StoryAnchor: w/ Predicted Nouns	Visual Storytelling via Predicting Anchor …	2020-01-13
Visual Storytelling	TAVST (RL)	Keep it Consistent: Topic-Aware Storytelling …	2019-11-11
Visual Storytelling	AREL	What Makes A Good Story? …	2019-09-11
Visual Storytelling	BLEU-RL	What Makes A Good Story? …	2019-09-11
Visual Storytelling	MLE	What Makes A Good Story? …	2019-09-11
Visual Storytelling	ReCo-RL	What Makes A Good Story? …	2019-09-11
Visual Storytelling	HSRL	What Makes A Good Story? …	2019-09-11
Visual Storytelling	VSCMR	Informative Visual Storytelling with Cross-modal …	2019-07-07
Visual Storytelling	CST	Contextualize, Show and Tell: A …	2018-06-03
Visual Storytelling	GLAC Net	GLAC Net: GLocal Attention Cascading …	2018-05-28
Visual Storytelling	HSRL w/ Joint Training	Hierarchically Structured Reinforcement Learning for …	2018-05-21
Visual Storytelling	AREL-t-100	No Metrics Are Perfect: Adversarial …	2018-04-24

Research Papers

Recent papers with results on this dataset:
