VocalSound

Dataset Information
Modalities: Audio
Introduced: 2022

Overview

VocalSound is a free dataset of 21,024 crowdsourced recordings of laughter, sighs, coughs, throat clearing, sneezes, and sniffs from 3,365 unique subjects. It also includes speaker metadata such as age, gender, native language, country, and health condition.
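
As a rough illustration of how the six classes and the speaker metadata described above might be represented in code, here is a minimal Python sketch. The `Recording` layout, its field names, and the label strings are assumptions made for this example; they are not taken from the dataset's actual file format or metadata schema.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

# The six sound classes named in the overview (label strings are illustrative).
LABELS = ("laughter", "sigh", "cough", "throat clearing", "sneeze", "sniff")


@dataclass(frozen=True)
class Recording:
    # Hypothetical record layout; the real metadata files may differ.
    subject_id: str
    label: str
    age: Optional[int] = None
    gender: Optional[str] = None
    native_language: Optional[str] = None
    country: Optional[str] = None
    health_condition: Optional[str] = None

    def __post_init__(self):
        # Reject labels outside the six known classes.
        if self.label not in LABELS:
            raise ValueError(f"unknown label: {self.label!r}")


def count_by_label(recordings):
    """Return the number of recordings per class, including empty classes."""
    counts = Counter(r.label for r in recordings)
    return {label: counts.get(label, 0) for label in LABELS}


# Figures quoted in the overview.
TOTAL_RECORDINGS = 21_024
UNIQUE_SUBJECTS = 3_365

# Average number of recordings contributed per subject (roughly 6.2).
mean_per_subject = TOTAL_RECORDINGS / UNIQUE_SUBJECTS
```

Calling `count_by_label` on a list of `Recording` objects gives a per-class tally, which is a quick way to check class balance before training a classifier.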

Variants: VocalSound

Associated Benchmarks

This dataset is used in 1 benchmark:

Recent Benchmark Submissions

Task                  Model                Paper                                                   Date
Audio Classification  Qwen-Audio           Qwen-Audio: Advancing Universal Audio Understanding …   2023-11-14
Audio Classification  VocalSound Baseline  Vocalsound: A Dataset for Improving …                   2022-05-06

Research Papers

Recent papers with results on this dataset: