MusicBench

Dataset Information

Modalities

Texts, Audio, Music

Introduced

2023

License

CC

Homepage

Official Website

Contents

Overview
Associated Benchmarks
Recent Benchmark Submissions
Research Papers

Overview

MusicBench is a dataset of music audio-text pairs designed for text-to-music generation and released together with the Mustango text-to-music model. It is based on the MusicCaps dataset, which it expands from 5,521 samples to 52,768 training samples and 400 test samples.

Dataset Details
MusicBench expands MusicCaps by:

Including music features of chords, beats, tempo, and key that are extracted from the audio.
Describing these music features using text templates and thus enhancing the original text prompts.
Expanding the number of audio samples by performing musically meaningful augmentations: semitone pitch shifts, tempo changes, and volume changes.
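
The template-based prompt enhancement described above can be sketched roughly as follows. This is a minimal illustration only: the `MusicFeatures` container, the function names, and the template wording are all hypothetical and are not the templates actually used to build MusicBench.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MusicFeatures:
    """Hypothetical container for features extracted from one audio clip."""
    chords: List[str]     # chord sequence, e.g. ["Am", "F", "C", "G"]
    beats_per_bar: int    # beat count per bar
    tempo_bpm: float      # tempo in beats per minute
    key: str              # musical key, e.g. "A minor"


def describe_features(f: MusicFeatures) -> str:
    """Render the features as sentences using illustrative (assumed) templates."""
    sentences = [
        f"The chord progression is {', '.join(f.chords)}.",
        f"The beat counts to {f.beats_per_bar}.",
        f"The tempo is {f.tempo_bpm:.0f} bpm.",
        f"The key is {f.key}.",
    ]
    return " ".join(sentences)


def enhance_prompt(caption: str, f: MusicFeatures) -> str:
    """Append the feature description to the original text prompt."""
    return f"{caption.rstrip()} {describe_features(f)}"


features = MusicFeatures(
    chords=["Am", "F", "C", "G"], beats_per_bar=4, tempo_bpm=120.0, key="A minor"
)
print(enhance_prompt("A mellow acoustic guitar piece.", features))
```

Keeping the feature description separate from the original caption makes it easy to vary or drop individual template sentences per sample.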

Train set size = 52,768 samples
Test set size = 400 samples

This dataset also includes FMACaps, which was used as a second test set.

Variants: MusicBench

Associated Benchmarks

This dataset is used in 1 benchmark:

Text-to-Music Generation - Metrics: FAD
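
For reference, FAD is commonly read as Fréchet Audio Distance, which compares Gaussian statistics of audio embeddings from a reference set and a generated set (lower is better). In the sketch below, the symbols are assumptions for illustration: $(\mu_r, \Sigma_r)$ and $(\mu_g, \Sigma_g)$ are the mean and covariance of the embeddings of the reference and generated audio, and the embedding model is not specified here.

```latex
\mathrm{FAD} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```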

Recent Benchmark Submissions

Task	Model	Paper	Date
Text-to-Music Generation	Mustango (non-pretrained)	Mustango: Toward Controllable Text-to-Music Generation	2023-11-14

Research Papers

Recent papers with results on this dataset:

Mustango: Toward Controllable Text-to-Music Generation (2023)

External Links:

MusicBench
