Demucs / HTDemucs

Summary

Meta's Demucs family is the dominant open-source lineage for music source separation. Demucs (2019/2021) showed that waveform-domain separation could be competitive, using a U-Net with a bidirectional LSTM and reaching 6.3 dB SDR on MUSDB18. Hybrid Demucs (2021) let the model choose between spectrogram-domain and waveform-domain processing; it won the MDX 2021 competition and reported a 1.4 dB SDR improvement. HTDemucs (2022) replaced the innermost U-Net layers with cross-domain Transformer encoders (self-attention within each domain plus cross-attention between the time and frequency representations), reaching 9.20 dB SDR with extra training data, which was state of the art at publication.
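To make the SDR figures above easier to read, here is a minimal sketch of the basic signal-to-distortion ratio, 10·log10(reference energy / error energy), along with a helper that converts a dB gain into an energy ratio. This is the simple global definition only; it is an assumption that it is close enough for intuition, and it will not reproduce published MUSDB18 numbers, which come from dedicated evaluation tooling with framing and aggregation rules. The function names `sdr_db` and `db_gain_to_energy_ratio` are hypothetical.

```python
import math


def sdr_db(reference, estimate):
    """Basic SDR in dB for two equal-length sequences of samples."""
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if signal_energy == 0 or error_energy == 0:
        raise ValueError("SDR is undefined for a silent reference or a perfect estimate")
    return 10.0 * math.log10(signal_energy / error_energy)


def db_gain_to_energy_ratio(delta_db):
    """Convert a dB difference into the matching ratio of energies."""
    return 10.0 ** (delta_db / 10.0)


# Toy example: the estimate is the reference scaled by 0.9, so the error
# has 1% of the reference energy, which works out to about 20 dB.
reference = [math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
estimate = [0.9 * r for r in reference]
print(round(sdr_db(reference, estimate), 2))

# A 1.4 dB gain corresponds to roughly a 1.38x better signal-to-error energy ratio.
print(round(db_gain_to_energy_ratio(1.4), 2))
```

Because the scale is logarithmic, the step from 6.3 dB to 9.20 dB is a larger change in residual error energy than the raw numbers might suggest.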

Key Claims

Waveform-domain separation can match or exceed spectrogram-domain approaches
Hybrid spectrogram/waveform processing outperforms either alone
Cross-domain Transformers (attending across time and frequency representations) improve over pure convolutional U-Nets
Pretrained 4-stem (vocals/bass/drums/other) and 6-stem (adds guitar and piano) models are available

Relevance to Bluegrass

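Demucs is the best general-purpose separator installable with pip install demucs. Critical limitation: in the 4-stem models, every instrument other than vocals, bass, and drums lands in the "other" stem, and even the 6-stem model only splits out guitar and piano, so banjo, mandolin, and fiddle are not distinguished from one another. The "other" stem would need further processing to isolate them.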
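As a rough sketch of how this fits into a pipeline, the snippet below builds a `demucs` command line and predicts where the stems should land, so the "other" stem can be handed to a later processing step. It assumes the `demucs` console script installed by pip, its `-n` (model name) and `-o` (output folder) options, and the default output layout of `<out>/<model>/<track name>/<stem>.wav`; check these against the installed version. The helper names `demucs_command` and `expected_stem_paths` are hypothetical, and nothing is executed here.

```python
from pathlib import Path

# Stems produced by two of the pretrained models (assumed model names).
STEMS = {
    "htdemucs": ("drums", "bass", "other", "vocals"),
    "htdemucs_6s": ("drums", "bass", "other", "vocals", "guitar", "piano"),
}


def demucs_command(track, model="htdemucs", out_dir="separated"):
    """Return the argument list for a demucs CLI call (does not run it)."""
    if model not in STEMS:
        raise ValueError(f"unknown model: {model}")
    return ["demucs", "-n", model, "-o", str(out_dir), str(track)]


def expected_stem_paths(track, model="htdemucs", out_dir="separated"):
    """Map each stem name to the path where the CLI is assumed to write it."""
    if model not in STEMS:
        raise ValueError(f"unknown model: {model}")
    track_name = Path(track).stem  # file name without its extension
    base = Path(out_dir) / model / track_name
    return {stem: base / f"{stem}.wav" for stem in STEMS[model]}


cmd = demucs_command("songs/tune.mp3")
other_stem = expected_stem_paths("songs/tune.mp3")["other"]
print(" ".join(cmd))
print(other_stem)  # banjo, mandolin, and fiddle would all be mixed in this file
```

With Demucs installed, the command could be run with `subprocess.run(cmd, check=True)`; the first run typically downloads the pretrained weights.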

Repo archived by Meta Jan 2025 but still functional. GitHub: facebookresearch/demucs (10.1k stars).

Related

../entities/demucs - entity page
../concepts/spectrogram-unets - U-Net architecture family
../entities/bs-roformer - current SOTA, outperforms HTDemucs
../entities/spleeter - alternative, lower quality but faster
