BS-RoFormer
Summary
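State-of-the-art music source separation on MUSDB18HQ at the time of publication: 9.80 dB average SDR without extra training data. Builds on band-split processing (dividing the frequency axis into subbands and embedding each one separately), an idea introduced earlier by Band-Split RNN (BSRNN), and replaces the recurrent layers with hierarchical Transformers that use Rotary Position Embedding (RoPE) and operate on the complex spectrogram. A version trained with extra data won the SDX23 MSS track. ByteDance/SAMI, Sep 2023.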
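For intuition about what a dB SDR figure measures, here is a minimal sketch of a simplified signal-to-distortion ratio, 10·log10(‖s‖² / ‖s − ŝ‖²). The function name `sdr_db` and the `eps` guard are assumptions made for illustration. Published MUSDB18HQ numbers are normally produced by a dedicated evaluation toolkit with chunked aggregation, so this will not reproduce them exactly.

```python
import math


def sdr_db(reference, estimate, eps=1e-12):
    """Simplified SDR in dB: energy of the reference over energy of the error.

    `reference` and `estimate` are equal-length sequences of samples.
    `eps` (an illustrative choice) avoids log(0) and division by zero.
    """
    if len(reference) != len(estimate):
        raise ValueError("reference and estimate must have the same length")
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))


if __name__ == "__main__":
    ref = [1.0, -1.0, 1.0, -1.0]
    est = [0.9, -0.9, 0.9, -0.9]  # estimate with a 10% amplitude error
    print(f"{sdr_db(ref, est):.2f} dB")
```

Because the scale is logarithmic, a gain of 0.6 dB (9.20 to 9.80) corresponds to roughly 13% less error energy for the same signal energy under this simplified definition.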
Key Claims
- Band-split processing (separate subband analysis) improves over full-band approaches
- RoPE Transformers in the frequency domain capture long-range spectral dependencies
- Hierarchical architecture interleaves Transformers over time within each subband (inner-band) with Transformers across subbands (inter-band)
- SOTA without extra training data (9.80 dB vs. HTDemucs 9.20 dB with extra data)
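The two mechanisms named in the claims above can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the paper's implementation: `split_into_bands`, `rope`, and `dot` are hypothetical helper names, the band widths in the usage example are made up, and `base=10000.0` is the value commonly used for RoPE rather than one confirmed for this model.

```python
import math


def split_into_bands(frame, band_widths):
    """Split one spectrogram frame (a list of per-bin values) into contiguous subbands."""
    if sum(band_widths) != len(frame):
        raise ValueError("band widths must sum to the number of frequency bins")
    bands, start = [], 0
    for width in band_widths:
        bands.append(frame[start:start + width])
        start += width
    return bands


def rope(vec, position, base=10000.0):
    """Apply rotary position embedding to an even-length feature vector.

    Each consecutive pair of features is rotated by position * base**(-i / dim),
    where i is the index of the first feature in the pair.
    """
    if len(vec) % 2:
        raise ValueError("feature dimension must be even")
    dim = len(vec)
    out = []
    for i in range(0, dim, 2):
        angle = position * base ** (-i / dim)
        c, s = math.cos(angle), math.sin(angle)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


def dot(a, b):
    """Plain dot product, standing in for an attention score."""
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    # Toy layout: narrow bands at low frequencies, wider bands higher up.
    frame = list(range(16))
    print(split_into_bands(frame, [2, 2, 4, 8]))

    # The score depends only on the position offset (3 in both cases).
    q, k = [0.3, -1.2, 0.7, 0.5], [1.0, 0.4, -0.6, 0.9]
    print(dot(rope(q, 5), rope(k, 2)), dot(rope(q, 13), rope(k, 10)))
```

The second print shows the property that makes RoPE attractive here: the query-key score depends only on the relative offset between positions, whether those positions index time frames or subbands.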
Relevance to Bluegrass
Highest-quality general-purpose separator. Multiple community reimplementations exist (most popular: lucidrains/BS-RoFormer, 811 stars). It shares the fundamental limitation of all fixed-stem separators: with the four MUSDB18 stems (vocals, drums, bass, other), banjo, mandolin, and fiddle all end up in "other."
The band-split approach might be particularly well-suited to acoustic instruments with distinct formant regions — worth investigating whether fine-tuning on bluegrass stems would benefit from the subband architecture.
Related
- ../entities/bs-roformer — entity page
- ../entities/demucs — previous SOTA, more accessible
- ../concepts/spectrogram-unets — earlier U-Net paradigm