Deep Clustering for Source Separation

Definition

A source separation approach in which a neural network embeds each time-frequency (T-F) bin of the mixture spectrogram into a space where bins dominated by the same source cluster together. Separation is performed by clustering the embeddings, then applying the cluster assignments as masks on the mixture.
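The inference side of this definition can be sketched in a few lines. This is a minimal illustration, not the reference implementation: it assumes the network has already produced an embedding array of shape (T, F, D), and the names `kmeans` and `masks_from_embeddings` are hypothetical helpers written for this note. A plain NumPy Lloyd's k-means stands in for whatever clustering library would be used in practice.

```python
import numpy as np

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    rng = np.random.default_rng(seed)
    # Initialise centers from k distinct data points (fancy indexing copies).
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every point to every center: shape (N, k).
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = d2.argmin(axis=1)
        for c in range(k):
            members = points[labels == c]
            if len(members):  # keep the old center if a cluster is empty
                centers[c] = members.mean(axis=0)
    return labels

def masks_from_embeddings(embeddings, n_sources):
    """embeddings: (T, F, D) array -> binary masks of shape (n_sources, T, F)."""
    T, F, D = embeddings.shape
    labels = kmeans(embeddings.reshape(T * F, D), n_sources).reshape(T, F)
    return np.stack([(labels == c).astype(np.float32) for c in range(n_sources)])
```

Each mask would then be multiplied element-wise with the mixture magnitude spectrogram to estimate one source. The order in which the sources come out is arbitrary, since cluster indices carry no meaning.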

Key Ideas

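  • Introduced by Hershey et al. (2016). Each T-F bin is mapped to a D-dimensional embedding.
  • Training objective: make the embedding affinity matrix VV^T match the ideal affinity matrix YY^T built from the source labels. This pulls together embeddings of bins belonging to the same source and pushes apart embeddings of bins from different sources.
  • At inference: run k-means on the embeddings to get cluster assignments, then use the assignments as binary masks.
  • Elegantly handles the permutation problem: both the affinity-based objective and the clustering step are invariant to the ordering of the sources.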
  • Foundation for later work: deep attractor networks, anchored deep clustering.
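The training objective in the list above is commonly written as the squared Frobenius norm ||VV^T - YY^T||_F^2, where V holds one embedding per T-F bin and Y holds one-hot source labels. The sketch below is an illustration under that assumption; the function name `deep_clustering_loss` is hypothetical. It uses the standard low-rank expansion so that the N x N affinity matrices are never formed.

```python
import numpy as np

def deep_clustering_loss(V, Y):
    """Affinity loss ||V V^T - Y Y^T||_F^2, computed without (N, N) matrices.

    V: (N, D) embeddings, one row per T-F bin.
    Y: (N, C) one-hot labels marking the dominant source of each bin.
    """
    vtv = V.T @ V  # (D, D)
    vty = V.T @ Y  # (D, C)
    yty = Y.T @ Y  # (C, C)
    # ||VV^T - YY^T||_F^2 = ||V^T V||_F^2 - 2 ||V^T Y||_F^2 + ||Y^T Y||_F^2
    return (vtv ** 2).sum() - 2.0 * (vty ** 2).sum() + (yty ** 2).sum()
```

Reordering the columns of Y (that is, relabelling the sources) leaves YY^T unchanged, so the loss value is the same for every source ordering. This is the sense in which the objective is permutation-invariant.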

Relationships

Sources

None ingested yet — seed batch setup.