Synthetic Mixing / Data Augmentation Pipelines
Definition
Synthetic mixing creates training data for source separation by artificially combining isolated stems with randomized gains, effects, and spatialization. Because the same stems can be recombined in many different ways, it multiplies the available training data and improves generalization.
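One way to write this down, as an illustrative formalization rather than notation taken from a source: let $s_i$ be the $i$-th isolated stem, $f_i$ a randomly parameterized effects chain (EQ, reverb, panning), and $g_i$ a random gain. All symbols here are assumptions introduced for this sketch.

```latex
\begin{align}
  y_i &= g_i \, f_i(s_i), \qquad i = 1, \dots, N \\
  x   &= \sum_{i=1}^{N} y_i
\end{align}
```

The mixture $x$ is the model input and the processed stems $y_i$ are the training targets, so the targets sum exactly to the mixture.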
Key Ideas
- Start with isolated stems (from datasets like MUSDB18, Slakh2100, or solo recordings).
- Apply random gains, panning, EQ, reverb (via ../entities/pedalboard, ../entities/echothief, ../entities/openair) to each stem.
- Sum the processed stems into a mixture; the ground-truth stems are known by construction.
- Slakh2100 takes this further: it synthesizes stems from MIDI with virtual instruments, then mixes them.
- Critical for domains with limited multi-track data (e.g., guitar transcription from GuitarSet).
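The steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference pipeline: the function name `random_mix`, the gain range, and the constant-power pan law are all assumptions chosen for the example, and EQ and reverb are left out (they would be applied to each stem before the gain, for instance with an effects library such as pedalboard).

```python
import numpy as np


def random_mix(stems, rng, gain_db_range=(-6.0, 6.0)):
    """Mix mono stems into a stereo mixture with random gain and pan.

    stems: dict mapping a stem name to a 1-D float array (equal lengths).
    Returns (mixture, targets); mixture has shape (2, n_samples) and
    targets maps each name to its processed stereo stem, same shape.
    Hypothetical helper for illustration only.
    """
    targets = {}
    for name, audio in stems.items():
        # Random gain drawn in decibels, converted to a linear factor.
        gain = 10.0 ** (rng.uniform(*gain_db_range) / 20.0)
        # Random pan in [-1, 1], mapped to a constant-power pan angle.
        pan = rng.uniform(-1.0, 1.0)
        theta = (pan + 1.0) * np.pi / 4.0
        left, right = np.cos(theta), np.sin(theta)
        targets[name] = gain * np.stack([left * audio, right * audio])

    # The mixture is the sum of the processed stems, so the targets
    # are exact ground truth by construction.
    mixture = np.sum(list(targets.values()), axis=0)

    # Rescale mixture and targets together if the sum would clip.
    peak = np.max(np.abs(mixture))
    if peak > 1.0:
        scale = 1.0 / peak
        mixture = mixture * scale
        targets = {name: stem * scale for name, stem in targets.items()}
    return mixture, targets


# Usage with two synthetic sine "stems" standing in for real recordings.
rng = np.random.default_rng(0)
sr = 44100
t = np.arange(sr) / sr
stems = {
    "bass": 0.5 * np.sin(2 * np.pi * 110 * t),
    "vocals": 0.5 * np.sin(2 * np.pi * 440 * t),
}
mixture, targets = random_mix(stems, rng)
```

Calling `random_mix` repeatedly on the same stems yields a different mixture each time, which is where the data multiplication comes from. Scaling the mixture and the targets by the same factor keeps the targets summing to the mixture.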
Relationships
- Uses ../entities/pedalboard, ../entities/echothief, ../entities/openair
- Related to ../concepts/mixture-invariant-training — MixIT can train without isolated stems
- Datasets: ../entities/musdb18, ../entities/slakh2100, ../entities/guitarset
Sources
None ingested yet — seed batch setup.