MusicBench

Dataset Information
Modalities: Texts, Audio, Music
Introduced: 2023
License: CC

Overview

MusicBench is a dataset of paired music audio and text descriptions, designed for text-to-music generation and released together with the Mustango text-to-music model. It is based on the MusicCaps dataset, which it expands from 5,521 samples to 52,768 training samples and 400 test samples.

Dataset Details
MusicBench expands MusicCaps by:

Including music features of chords, beats, tempo, and key that are extracted from the audio.
Describing these music features using text templates and thus enhancing the original text prompts.
Expanding the number of audio samples by performing musically meaningful augmentations: semitone pitch shifts, tempo changes, and volume changes.
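
The template-based prompt enhancement and the semitone pitch shift described above can be sketched roughly as follows. This is a minimal illustration, not the actual MusicBench pipeline: the `MusicFeatures` container, the sentence templates, and the function names are all hypothetical, and the real templates may be worded differently. The only fixed fact used is that, in equal temperament, a shift of n semitones scales frequency by 2^(n/12).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MusicFeatures:
    """Hypothetical container for the four feature types named in the text."""
    chords: List[str]
    beats_per_bar: int
    tempo_bpm: float
    key: str


def describe_features(feats: MusicFeatures) -> str:
    """Render the features as sentences using made-up templates (assumption)."""
    parts = [
        f"The key is {feats.key}.",
        f"The tempo is {feats.tempo_bpm:.0f} bpm.",
        f"The beat counts to {feats.beats_per_bar}.",
    ]
    if feats.chords:
        parts.append("The chord progression is " + ", ".join(feats.chords) + ".")
    return " ".join(parts)


def enhance_prompt(caption: str, feats: MusicFeatures) -> str:
    """Append the feature description to the original caption."""
    return caption.rstrip() + " " + describe_features(feats)


def semitone_ratio(n_semitones: int) -> float:
    """Frequency ratio for a pitch shift of n semitones in equal temperament."""
    return 2.0 ** (n_semitones / 12.0)


if __name__ == "__main__":
    feats = MusicFeatures(
        chords=["Am", "F", "C", "G"],
        beats_per_bar=4,
        tempo_bpm=120.0,
        key="A minor",
    )
    print(enhance_prompt("A mellow acoustic guitar piece.", feats))
    print(semitone_ratio(2))  # ratio for shifting up by two semitones
```

When a sample is pitch-shifted or its tempo is changed, the key, chord, and tempo text would also need to be regenerated so that the enhanced prompt still matches the augmented audio; how MusicBench handles this is not stated here.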

Train set size = 52,768 samples
Test set size = 400 samples

This dataset also includes FMACaps, which was used as a second test set.

Variants: MusicBench

Associated Benchmarks

This dataset is used in 1 benchmark: Text-to-Music Generation.

Recent Benchmark Submissions

Task | Model | Paper | Date
Text-to-Music Generation | Mustango (non-pretrained) | Mustango: Toward Controllable Text-to-Music Generation | 2023-11-14
