Topic: Datasets

Overview

Key datasets for training and evaluating audio separation, transcription, classification, and representation learning models.

Sub-topics / Concepts

Key Entities

Separation & Transcription

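  • ../entities/musdb18 — MUSDB18: Standard benchmark for music source separation. 150 full-length tracks (100 train, 50 test), each with 4 stereo stems (drums, bass, other, vocals). It plays a role for source separation comparable to ImageNet's for image classification.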
  • ../entities/slakh2100 — Slakh2100: Synthesized multi-track dataset. 2100 tracks rendered from MIDI with virtual instruments, each with individual instrument stems.
  • ../entities/guitarset — GuitarSet: Solo guitar recordings captured with a hexaphonic pickup, providing per-string audio and playing technique annotations.
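
To make the four-stem layout concrete, here is a minimal sketch of how stems relate to a mixture. It uses synthetic noise in place of real audio and treats the mixture as the plain sample-wise sum of the stems, which is an assumption of this sketch. The helper names (make_stereo_noise, mix, accompaniment) are hypothetical and not part of any dataset tooling.

```python
import random

# The four stem names used by MUSDB18.
STEM_NAMES = ("drums", "bass", "other", "vocals")


def make_stereo_noise(num_samples, seed):
    """Return a synthetic stereo signal as a list of (left, right) samples."""
    rng = random.Random(seed)
    return [
        (rng.uniform(-0.25, 0.25), rng.uniform(-0.25, 0.25))
        for _ in range(num_samples)
    ]


def mix(stems):
    """Sum stems sample by sample to form a stereo mixture (assumed linear)."""
    signals = list(stems.values())
    return [
        (sum(s[i][0] for s in signals), sum(s[i][1] for s in signals))
        for i in range(len(signals[0]))
    ]


def accompaniment(stems, target="vocals"):
    """Mix every stem except the target (hypothetical helper)."""
    return mix({name: sig for name, sig in stems.items() if name != target})


# Placeholder data: one short noise signal per stem, not real audio.
stems = {name: make_stereo_noise(1000, seed=i) for i, name in enumerate(STEM_NAMES)}
mixture = mix(stems)
backing = accompaniment(stems)
```

Under this assumption the mixture equals the target stem plus the accompaniment, which is the relationship a separation model is trained to invert. Real recordings would be loaded from the dataset files rather than generated.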

Classification & Captioning

General Music / Multi-track

Sources

None ingested yet — seed batch setup.

Open Questions

  • What is the legal/licensing status of each dataset for commercial use?
  • How well do models trained on Slakh2100 (synthetic) generalize to real recordings compared with MUSDB18-trained models?
  • GuitarSet is small — what data augmentation strategies work for guitar transcription?