AudioSep
About
Text-conditioned source separation system (arXiv, Aug 2023; paper title "Separate Anything You Describe"). A CLAP text encoder maps the query into CLAP's shared audio-text embedding space, and that embedding conditions a ResUNet separation network, which extracts the matching source from the mixture. Queries are open-vocabulary rather than limited to fixed stems. GitHub: Audio-AGI/AudioSep (1.9k stars). LGPL license. Pretrained checkpoints available.
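To make the conditioning idea concrete, here is a toy sketch of how a query embedding can steer a separation mask through feature-wise scale and shift (FiLM-style modulation). It is not AudioSep's code: the feature values, weight matrices, and query vectors are made-up placeholders, and the real system uses learned CLAP embeddings and a deep ResUNet rather than a single modulation step.

```python
import math


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def film(features, gamma, beta):
    """Scale and shift each feature channel by its own gamma and beta."""
    return [[g * x + b for x in channel]
            for channel, g, b in zip(features, gamma, beta)]


def query_conditioned_mask(features, query_embedding, w_gamma, w_beta):
    """Turn features plus a query embedding into a per-bin mask in (0, 1)."""
    # The query embedding decides how strongly each channel contributes.
    gamma = matvec(w_gamma, query_embedding)
    beta = matvec(w_beta, query_embedding)
    modulated = film(features, gamma, beta)
    n_bins = len(features[0])
    # Sum the modulated channels per bin, then squash to (0, 1).
    logits = [sum(channel[i] for channel in modulated) for i in range(n_bins)]
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


def apply_mask(mixture, mask):
    """Attenuate each mixture bin by its mask value."""
    return [m * x for m, x in zip(mask, mixture)]


# Placeholder data: 2 feature channels over 3 time-frequency bins.
features = [[2.0, -2.0, 0.5],
            [-1.0, 3.0, 0.5]]
w_gamma = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical projection weights
w_beta = [[0.0, 0.0], [0.0, 0.0]]   # no shift in this toy example
mixture = [1.0, 1.0, 1.0]

query_a = [1.0, 0.0]  # stand-in for the embedding of one text query
query_b = [0.0, 1.0]  # stand-in for the embedding of another

mask_a = query_conditioned_mask(features, query_a, w_gamma, w_beta)
mask_b = query_conditioned_mask(features, query_b, w_gamma, w_beta)
separated_a = apply_mask(mixture, mask_a)
separated_b = apply_mask(mixture, mask_b)
```

The same mixture and features yield different masks for the two queries, which is the property that lets one network serve arbitrary text prompts instead of a fixed stem list.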
Relevance
The most novel approach considered here for bluegrass separation. In theory, a query such as "banjo" or "mandolin" could isolate a specific instrument without custom training. Untested on acoustic folk music: the CLAP embeddings may not generalize well to bluegrass instrument vocabulary.
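Since the generalization question can only be settled by listening tests, a small harness for running several instrument queries against one recording may help. The sketch below is an assumption-heavy outline: `run_queries`, `slugify`, and the `separate` callable are hypothetical names, and `separate` stands in for whatever inference entry point the AudioSep repository exposes, which should be checked against its README before wiring it in.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable

# Hypothetical signature: (mixture_path, text_query, output_path) -> None.
# In practice this would wrap the real AudioSep inference call.
SeparateFn = Callable[[Path, str, Path], None]


def slugify(query: str) -> str:
    """Turn a free-text query into a filename-safe, hyphenated slug."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in query.lower())
    return "-".join(cleaned.split())


def run_queries(mixture: Path, queries: Iterable[str], out_dir: Path,
                separate: SeparateFn) -> Dict[str, Path]:
    """Run one separation per query and return a query -> output path map."""
    results: Dict[str, Path] = {}
    for query in queries:
        output = out_dir / f"{mixture.stem}__{slugify(query)}.wav"
        # The callable is responsible for creating out_dir and writing audio.
        separate(mixture, query, output)
        results[query] = output
    return results


def log_only(mixture: Path, query: str, output: Path) -> None:
    """Dry-run stub that prints the planned call instead of separating."""
    print(f"would separate {query!r} from {mixture} into {output}")


if __name__ == "__main__":
    # Placeholder file names; swap log_only for a real AudioSep wrapper.
    run_queries(Path("take1.wav"), ["banjo", "mandolin", "upright bass"],
                Path("stems"), log_only)
```

Injecting the separation function keeps the harness testable without a GPU or checkpoint, and makes it easy to run the same query list through a fixed-stem baseline for comparison.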
Mentions
- ../sources/2026-05-19-audiosef — ingested summary
- ../entities/clap — underlying audio-language model
- ../concepts/query-based-source-separation — concept page
- ../entities/demucs — alternative fixed-stem approach