Topic: Source Separation

Overview

Audio source separation is the task of isolating individual sound sources from a mixture. This topic covers deep learning approaches (mask inference, spectrogram prediction, waveform models), classical techniques, toolkits, and training paradigms such as mixture invariant training (MixIT).
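To make the mask inference idea concrete, the sketch below applies time-frequency masks to a mixture spectrogram and inverts the result. It is a toy illustration under stated assumptions, not how any particular model listed here works: the masks are oracle ratio masks computed from the known sources (a trained network would predict them from the mixture alone), and the STFT is a crude non-overlapping, rectangular-window transform chosen so the example needs only NumPy. All function names are hypothetical.

```python
import numpy as np

def stft_frames(x, frame_len=256):
    """Crude STFT: non-overlapping rectangular frames, one rFFT per frame."""
    n_frames = len(x) // frame_len
    frames = x[: n_frames * frame_len].reshape(n_frames, frame_len)
    return np.fft.rfft(frames, axis=1)

def istft_frames(spec, frame_len=256):
    """Inverse of stft_frames: inverse rFFT per frame, then concatenate."""
    return np.fft.irfft(spec, n=frame_len, axis=1).reshape(-1)

def ratio_masks(source_specs, eps=1e-8):
    """Oracle ratio masks: each source's share of the summed magnitudes."""
    mags = np.abs(np.stack(source_specs))
    return mags / (mags.sum(axis=0, keepdims=True) + eps)

# Two synthetic "sources" with well-separated frequencies (assumed toy data).
sr = 8000
t = np.arange(sr) / sr
low = np.sin(2 * np.pi * 250 * t)
high = 0.5 * np.sin(2 * np.pi * 2000 * t)
mixture = low + high

# In a real system a network would predict these masks from the mixture.
masks = ratio_masks([stft_frames(low), stft_frames(high)])

# Mask inference: multiply each mask with the mixture spectrogram, then invert.
mix_spec = stft_frames(mixture)
estimates = [istft_frames(m * mix_spec) for m in masks]
```

Because the two tones occupy different frequency bins, the masked estimates closely match the original sources. Real mixtures overlap in time and frequency, which is why mask quality (and the choice between masking, direct spectrogram prediction, and waveform models) matters.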

Sub-topics / Concepts

Key Entities

Models

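  • ../entities/demucs / ../entities/htdemucs (Meta) — Demucs is a waveform-domain separation model with a U-Net-style encoder/decoder. Hybrid Demucs adds a parallel spectrogram branch, and HTDemucs (Hybrid Transformer Demucs) adds transformer layers across the two branches. Reported state-of-the-art results on MUSDB18 at release.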
  • ../entities/spleeter (Deezer) — Fast, pretrained separation library. U-Net based, separates into 2/4/5 stems. Widely used in production.
  • ../entities/open-unmix — Open-source reference implementation for music separation. BiLSTM-based, reproducible, strong baseline on MUSDB18.
  • ../entities/audiosef — AudioSep: text-queried / conditional source separation using CLAP embeddings.
  • ../entities/soundfilter (Google) — SoundFilter: conditional separation in which a wave-to-wave network is conditioned on a short reference clip of the target source.

Toolkits

  • ../entities/nussl (Northwestern / Interactive Audio Lab) — Comprehensive separation toolkit. Covers deep clustering, deep attractor networks, and mask inference. Educational and research focus.
  • ../entities/asteroid — PyTorch-based source separation toolkit. Modular: datasets, architectures, training recipes. Built on PyTorch Lightning.

Datasets (see also ../topics/datasets)

  • ../entities/musdb18 — Standard benchmark for music source separation (4 stems: drums, bass, vocals, other).
  • ../entities/slakh2100 — Synthesized multi-track dataset with 2100 tracks, individual instrument stems.

Sources

None ingested yet — seed batch setup.

Open Questions

  • HTDemucs vs Spleeter on non-music audio (field recordings, podcasts)?
  • How well does text-conditional separation (AudioSep) work for instrument-specific queries?
  • Does the MixIT paradigm remove the need for isolated-stem training data?
  • What's the practical latency/throughput tradeoff for real-time separation?
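
As background for the MixIT question above, here is a minimal sketch of a mixture invariant training objective. The model is given the sum of two reference mixtures and outputs several estimated sources; the loss searches over all ways of assigning each estimated source to one of the two reference mixtures and keeps the best remix. Assumptions of this sketch: mean squared error stands in for the signal-to-noise-ratio-based loss typically used in practice, the assignment search is brute force, and the function name `mixit_loss` is hypothetical.

```python
import itertools
import numpy as np

def mixit_loss(ref_mixtures, est_sources):
    """Brute-force mixture invariant loss (MSE variant, for illustration).

    ref_mixtures: array of shape (n_mix, T), the reference mixtures.
    est_sources:  array of shape (n_src, T), the model's source estimates.
    Returns the lowest remix error and the assignment that achieves it,
    where assignment[j] is the mixture index that source j is routed to.
    """
    n_mix, n_src = ref_mixtures.shape[0], est_sources.shape[0]
    best_loss, best_assign = np.inf, None
    for assign in itertools.product(range(n_mix), repeat=n_src):
        # Binary mixing matrix: each column (source) has exactly one 1.
        A = np.zeros((n_mix, n_src))
        A[list(assign), np.arange(n_src)] = 1.0
        remixed = A @ est_sources
        loss = np.mean((ref_mixtures - remixed) ** 2)
        if loss < best_loss:
            best_loss, best_assign = loss, assign
    return best_loss, best_assign

# Toy check with perfect "estimates": four random sources, two mixtures.
rng = np.random.default_rng(0)
sources = rng.standard_normal((4, 1000))
refs = np.stack([sources[0] + sources[2], sources[1] + sources[3]])
loss, assign = mixit_loss(refs, sources)
```

Note that the objective only ever compares remixed estimates against mixtures, so no isolated stems appear in the loss. Whether that fully removes the need for isolated-stem data in practice is the open question; the sketch only shows why it is possible in principle.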