NSynth is a dataset of one-shot instrumental notes, containing 305,979 musical notes, each with a unique pitch, timbre, and envelope. The sounds were collected from 1,006 instruments in commercial sample libraries and are annotated with their source (acoustic, electronic, or synthetic), instrument family, and sonic qualities. The instrument families used in the annotation are bass, brass, flute, guitar, keyboard, mallet, organ, reed, string, synth lead, and vocal. Each note is a four-second monophonic audio snippet sampled at 16 kHz.
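The annotations above are also encoded in each note's identifier. As a minimal sketch, assuming the common `<family>_<source>_<instrument>-<pitch>-<velocity>` naming convention used for NSynth files (e.g. `keyboard_acoustic_004-060-100.wav`), the fields can be recovered like this:

```python
# Sketch: parsing NSynth note identifiers into their annotated fields.
# The "<family>_<source>_<instrument>-<pitch>-<velocity>" layout is an
# assumption based on the public NSynth file naming convention.

def parse_note_id(note_id: str) -> dict:
    """Split an NSynth note ID into family, source, instrument, pitch, velocity."""
    instrument_part, pitch, velocity = note_id.rsplit("-", 2)
    # rsplit from the right so multi-word families like "synth_lead" stay intact
    family, source, instrument = instrument_part.rsplit("_", 2)
    return {
        "instrument_family": family,       # e.g. "keyboard" or "synth_lead"
        "instrument_source": source,       # "acoustic", "electronic", or "synthetic"
        "instrument_index": int(instrument),
        "pitch": int(pitch),               # MIDI pitch number
        "velocity": int(velocity),         # MIDI velocity
    }

# Each note is a 4-second monophonic clip at 16 kHz:
SAMPLES_PER_NOTE = 4 * 16_000  # 64,000 samples per note
```

This makes it easy to, for example, group the 305,979 notes by instrument family without loading any audio.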
Source: Data Augmentation for Instrument Classification Robust to Audio Effects
Image Source: https://magenta.tensorflow.org/nsynth
Variants: NSynth
This dataset is used in 1 benchmark:
| Task | Model | Paper | Date |
|---|---|---|---|
| Instrument Recognition | M2D-CLAP | M2D2: Exploring General-purpose Audio-Language Representations … | 2025-03-28 |
| Instrument Recognition | M2D2 AS+ | M2D2: Exploring General-purpose Audio-Language Representations … | 2025-03-28 |
| Instrument Recognition | M2D AS | M2D2: Exploring General-purpose Audio-Language Representations … | 2025-03-28 |
| Instrument Recognition | MATPAC (SSL, linear eval) | Masked Latent Prediction and Classification … | 2025-02-17 |
| Instrument Recognition | melspect | EfficientLEAF: A Faster LEarnable Audio … | 2022-07-12 |
| Instrument Recognition | EfficientLEAF | EfficientLEAF: A Faster LEarnable Audio … | 2022-07-12 |
| Instrument Recognition | LEAF | EfficientLEAF: A Faster LEarnable Audio … | 2022-07-12 |
Recent papers with results on this dataset: