
AudioSep

About

Text-conditioned source separation system (arXiv, Aug 2023). It uses CLAP to embed text queries in a space shared with audio; a ResUNet conditioned on that embedding then performs the separation. The paper's title, "Separate Anything You Describe", reflects its open-vocabulary queries, in contrast to systems that output a fixed set of stems. GitHub: Audio-AGI/AudioSep (1.9k stars). LGPL license. Pretrained checkpoints available.

Relevance

The most novel approach for bluegrass separation. In theory, a text query such as "banjo" or "mandolin" could isolate a specific instrument without any custom training. This is untested on acoustic folk music, and the CLAP embeddings may not generalize well to bluegrass instrument vocabulary.
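A minimal sketch of how such a test could be organized: run one separation pass per instrument query and collect the output paths. The helper `separate_by_queries` and the `SeparateFn` signature are hypothetical names introduced for illustration, not part of AudioSep; the actual separation call is passed in as a function, so nothing here assumes a particular AudioSep API.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable

# Hypothetical callable signature (an assumption, not AudioSep's API):
# (mixture_path, text_query, output_path) -> None
SeparateFn = Callable[[str, str, str], None]

# The two instrument queries mentioned in the note above.
BLUEGRASS_QUERIES = ["banjo", "mandolin"]


def separate_by_queries(
    separate: SeparateFn,
    mixture_path: str,
    queries: Iterable[str],
    out_dir: str,
) -> Dict[str, str]:
    """Run one text-queried separation per query; return {query: output path}."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    stem = Path(mixture_path).stem

    results: Dict[str, str] = {}
    for query in queries:
        # One output file per query, e.g. "jam_banjo.wav".
        target = out / f"{stem}_{query.replace(' ', '_')}.wav"
        separate(mixture_path, query, str(target))
        results[query] = str(target)
    return results
```

To try this against AudioSep, `separate` would be a thin wrapper around the inference entry point shipped in the Audio-AGI/AudioSep repository, with the model and checkpoint loaded once outside the loop. Listening to the per-query outputs is then a quick way to check whether the CLAP vocabulary covers bluegrass instruments at all.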

Mentions