Audio-Visual Scene-Aware Dialog
The Audio-Visual Scene-Aware Dialog (AVSD) dataset, also known as DSTC7 Track 3, is an audio-visual dataset for dialogue understanding. The goal of the dataset and track was to design systems that generate responses in a dialog about a video, given the dialog history and the audio-visual content of the video.
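To make the task setup concrete, the following is a minimal sketch of how one example might be represented and how its textual context could be assembled. The class name `SceneAwareDialogExample`, its fields, the `build_text_context` helper, and the placeholder values are all assumptions made for illustration; they are not the dataset's actual schema or file format, and audio-visual features are left out entirely.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SceneAwareDialogExample:
    """Hypothetical container for one example (not the official schema)."""
    video_id: str  # identifier of the video under discussion (made up here)
    history: List[Tuple[str, str]] = field(default_factory=list)  # earlier (question, answer) turns
    question: str = ""  # current question the system must answer


def build_text_context(example: SceneAwareDialogExample) -> str:
    """Flatten the dialog history and the current question into one string."""
    lines = []
    for past_question, past_answer in example.history:
        lines.append(f"Q: {past_question}")
        lines.append(f"A: {past_answer}")
    lines.append(f"Q: {example.question}")
    lines.append("A:")  # the response to be generated would follow here
    return "\n".join(lines)


# Placeholder content only; these are not real dataset entries.
example = SceneAwareDialogExample(
    video_id="video_0001",
    history=[("placeholder question 1", "placeholder answer 1")],
    question="placeholder question 2",
)
print(build_text_context(example))
```

A system for this task would combine such a textual context with features extracted from the video and audio tracks before generating the response.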
Source: The Eighth Dialog System Technology Challenge
Image Source: http://workshop.colips.org/dstc7/papers/DSTC7_Task_3_overview_paper.pdf
Variants: AVSD
This dataset is used in 1 benchmark:
Task | Model | Paper | Date |
---|---|---|---|
Scene-Aware Dialogue | simple | A Simple Baseline for Audio-Visual … | 2019-06-01 |