SoundDescs

Dataset Information
Modalities: Texts, Audio
Languages: English
Introduced: 2021

Overview

We introduce SoundDescs, a new audio dataset that can be used for tasks such as text-to-audio retrieval and audio captioning. The dataset contains 32,979 pairs of audio files and text descriptions. It spans 23 categories, including nature, clocks, and fire.

SoundDescs is available for download, and retrieval results for this dataset are reported in the associated paper, "Audio Retrieval with Natural Language Queries: A Benchmark Study".
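Text-to-audio retrieval on paired data of this kind is commonly scored with recall at K: the fraction of text queries whose paired audio clip appears among the top K retrieved clips. The following is a minimal sketch of that metric, not the evaluation code from the associated paper. The function name `recall_at_k`, the similarity-matrix layout, and the toy scores are all assumptions made for illustration, and ties are resolved in favour of the correct clip.

```python
from typing import Sequence


def recall_at_k(similarity: Sequence[Sequence[float]], k: int) -> float:
    """Fraction of text queries whose paired audio clip ranks in the top k.

    Assumption: similarity[i][j] is a score between text query i and audio
    clip j, and the correct match for query i is clip i.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if not similarity:
        raise ValueError("similarity must contain at least one query")
    hits = 0
    for i, row in enumerate(similarity):
        # Zero-based rank of the correct clip: how many clips score strictly higher.
        rank = sum(1 for score in row if score > row[i])
        if rank < k:
            hits += 1
    return hits / len(similarity)


# Toy scores for 3 (text, audio) pairs; the values are made up for illustration.
toy_scores = [
    [0.9, 0.1, 0.3],  # query 0: correct clip ranked 1st
    [0.8, 0.5, 0.2],  # query 1: correct clip ranked 2nd
    [0.4, 0.6, 0.1],  # query 2: correct clip ranked 3rd
]
r1 = recall_at_k(toy_scores, k=1)
r2 = recall_at_k(toy_scores, k=2)
r3 = recall_at_k(toy_scores, k=3)
```

In practice the similarity scores would come from a retrieval model that embeds the text descriptions and audio files into a shared space; the metric itself does not depend on how the scores are produced.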

Variants: SoundDescs

Associated Benchmarks

This dataset is used in 1 benchmark: Text to Audio Retrieval.

Recent Benchmark Submissions

Task                     Model                        Paper                                    Date
Text to Audio Retrieval  CE                           Audio Retrieval with Natural Language …  2021-12-17
Text to Audio Retrieval  MoEE                         Audio Retrieval with Natural Language …  2021-12-17
Text to Audio Retrieval  MMT                          Audio Retrieval with Natural Language …  2021-12-17
Text to Audio Retrieval  CE (pretrained: AudioCaps)   Audio Retrieval with Natural Language …  2021-12-17
