AVSD

Audio-Visual Scene-Aware Dialog

Dataset Information
Modalities: Videos, Texts, Audio
Languages: English
Introduced: 2018
License: Unknown

Overview

The Audio-Visual Scene-Aware Dialog (AVSD) dataset, also known as DSTC7 Track 3, is an audio-visual dataset for dialogue understanding. The goal of the dataset and track was to design systems that generate responses in a dialog about a video, given the dialog history and the audio-visual content of the video.
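
To make the task setup concrete, here is a minimal Python sketch of what one input to such a system might look like: a video reference, the earlier question-answer turns, and the current question. The class name, field names, and the `build_prompt` helper are hypothetical and chosen for illustration; they are not the dataset's actual schema or an official loader.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SceneAwareDialogExample:
    """Hypothetical container for one example (not the dataset's real schema)."""
    video_path: str  # assumed: path to the clip carrying both video and audio
    history: List[Tuple[str, str]] = field(default_factory=list)  # earlier (question, answer) turns
    question: str = ""  # the current question the system must answer


def build_prompt(example: SceneAwareDialogExample) -> str:
    """Flatten the dialog history and the current question into one text input."""
    lines = []
    for q, a in example.history:
        lines.append(f"Q: {q}")
        lines.append(f"A: {a}")
    lines.append(f"Q: {example.question}")
    lines.append("A:")  # the response to be generated would follow here
    return "\n".join(lines)
```

A response generator would consume this text together with features extracted from the clip at `video_path`; how those audio and visual features are computed is left open here.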

Source: The Eighth Dialog System Technology Challenge
Image Source: http://workshop.colips.org/dstc7/papers/DSTC7_Task_3_overview_paper.pdf

Variants: AVSD

Associated Benchmarks

This dataset is used in 1 benchmark:

Recent Benchmark Submissions

Task | Model | Paper | Date
Scene-Aware Dialogue | simple | A Simple Baseline for Audio-Visual … | 2019-06-01

Research Papers

Recent papers with results on this dataset: