VGG-Sound

Dataset Information
Modalities: Videos, Audio

Overview

VGG-Sound consists of more than 210k videos covering 310 audio classes.

Source: VGGSound: A Large-scale Audio-Visual Dataset

Variants: VGG-Sound, VGGSound

Associated Benchmarks

This dataset is used in 1 benchmark:

Recent Benchmark Submissions

Task | Model | Paper | Date
Multi-modal Classification | CAV-MAE (Audio-Visual) | Contrastive Audio-Visual Masked Autoencoder | 2022-10-02
Multi-modal Classification | UAVM | UAVM: Towards Unifying Audio and … | 2022-07-29

Research Papers

Recent papers with results on this dataset: