MagnaTagATune

Dataset Information
Modalities: Audio
Introduced: 2009
License: Unknown
Homepage:

Overview

The MagnaTagATune dataset contains 25,863 music clips. Each clip is a 29-second excerpt from one of 5,223 songs, 445 albums, and 230 artists. The clips span a broad range of genres such as Classical, New Age, Electronica, Rock, Pop, World, Jazz, Blues, Metal, and Punk. Each audio clip is supplied with a vector of binary annotations over 188 tags.

The annotations were collected from humans playing the two-player online TagATune game. In each round, the two players are presented with either the same or different audio clips and are asked to come up with tags for their clip. The players then view each other's tags and must decide whether they were presented with the same audio clip. A tag is assigned to a clip only when more than two players agreed on it. The annotations include tags like 'singer', 'no singer', 'violin', 'drums', 'classical', and 'jazz'.

The top 50 most popular tags are typically used for evaluation to ensure that there is enough training data for each tag. The dataset is distributed in 16 parts; researchers commonly use parts 1-12 for training, part 13 for validation, and parts 14-16 for testing.
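
As a rough sketch of how this split and the top-50 tag selection are commonly applied, the snippet below loads the tag annotations, keeps the 50 most frequent tags, and partitions clips by their part folder. It assumes the tab-separated annotations_final.csv file from the official download, with a clip_id column, one binary column per tag, and an mp3_path column whose leading directory is a hexadecimal digit (0-f) identifying the 16 parts; exact file and column names may vary with the distribution you use.

# Minimal sketch (Python/pandas): common MagnaTagATune split and top-50 tag selection.
# Assumes annotations_final.csv (tab-separated) with clip_id, 188 binary tag columns, mp3_path.
import pandas as pd

annotations = pd.read_csv("annotations_final.csv", sep="\t")

# Tag columns are everything except the clip identifier and the file path.
tag_columns = [c for c in annotations.columns if c not in ("clip_id", "mp3_path")]

# Keep the 50 most frequently applied tags, as is typical for evaluation.
top50 = annotations[tag_columns].sum().sort_values(ascending=False).head(50).index.tolist()

# The leading directory of mp3_path is a hex digit 0-f marking the 16 parts.
# Common convention: parts 0-b (1-12) for training, c (13) for validation, d-f (14-16) for testing.
part = annotations["mp3_path"].str[0]
train = annotations[part.isin(list("0123456789ab"))]
valid = annotations[part == "c"]
test = annotations[part.isin(list("def"))]

# Each split then yields (mp3_path, 50-dimensional binary tag vector) pairs.
print(len(train), len(valid), len(test))
print(train[top50].to_numpy().shape)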

Source: Brains on Beats
Audio Source: http://mirg.city.ac.uk/codeapps/the-magnatagatune-dataset

Variants: MagnaTagATune, MagnaTagATune (clean)

Associated Benchmarks

This dataset is used in 1 benchmark.

Recent Benchmark Submissions

Task | Model | Paper | Date
Music Auto-Tagging | M2D2 AS+ | M2D2: Exploring General-purpose Audio-Language Representations … | 2025-03-28
Music Auto-Tagging | MATPAC (SSL, linear eval) | Masked Latent Prediction and Classification … | 2025-02-17
Music Auto-Tagging | CLMR | Contrastive Learning of Musical Representations | 2021-03-17

Research Papers

Recent papers with results on this dataset: