Permutation Invariant Training
Definition
A training paradigm for source separation models that resolves the ambiguity of which output corresponds to which source. Instead of requiring a fixed source-to-output mapping, permutation invariant training (PIT) computes the loss under every permutation of output-to-reference assignments and trains on the minimum.
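One way to write this definition down, as a sketch: the notation here is an assumption of this note rather than something taken from a source. Let $s_1, \dots, s_N$ be the reference sources, $\hat{s}_1, \dots, \hat{s}_N$ the model outputs, $\ell$ any pairwise loss (for example MSE or negative SI-SNR), and $S_N$ the set of permutations of $\{1, \dots, N\}$.

```latex
\mathcal{L}_{\mathrm{PIT}}
  = \min_{\pi \in S_N} \frac{1}{N} \sum_{i=1}^{N}
    \ell\bigl(\hat{s}_{\pi(i)},\, s_i\bigr)
```

Here $\pi(i)$ is the index of the output assigned to reference $i$, so the minimum ranges over all $N!$ assignments.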
Key Ideas
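- Core problem: a separation model outputs N sources, but when the sources are of the same kind (e.g. several speakers) there is no canonical order. A fixed assignment (output-1 to source-A) penalizes a correct separation that happens to come out in a different order, so the training signal depends on an arbitrary labeling.
- PIT: For each training example, compute the loss for all N! permutations of output-to-reference mappings and use the permutation with minimum loss. The cost grows factorially in N, which is manageable for the small source counts (2 to 3) typical in practice.
- uPIT (utterance-level PIT): Selects a single permutation for the whole utterance instead of choosing one per frame or short segment, so each output channel follows the same source throughout and permutation-switching artifacts are reduced.
- Key enabler for deep learning-based separation of same-class sources: without PIT or a comparable permutation-free objective, the training targets are inconsistent across examples and models struggle with source permutation ambiguity.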
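The ideas above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function name `pit_mse`, the choice of MSE as the pairwise loss, and the `(n_sources, n_samples)` array layout are all assumptions made for the example. It selects one permutation for the whole signal, which corresponds to the utterance-level variant.

```python
from itertools import permutations

import numpy as np


def pit_mse(estimates: np.ndarray, references: np.ndarray):
    """Utterance-level PIT with an MSE pair loss (hypothetical helper).

    estimates, references: arrays of shape (n_sources, n_samples).
    Returns (min_loss, best_perm), where best_perm[i] is the index of the
    estimate assigned to reference i.
    """
    n_sources = references.shape[0]
    # pair_loss[i, j] = MSE between reference i and estimate j
    diff = references[:, None, :] - estimates[None, :, :]
    pair_loss = np.mean(diff ** 2, axis=-1)

    best_loss, best_perm = np.inf, None
    # Exhaustive search over all N! output-to-reference assignments
    for perm in permutations(range(n_sources)):
        loss = np.mean([pair_loss[i, perm[i]] for i in range(n_sources)])
        if loss < best_loss:
            best_loss, best_perm = loss, perm
    return float(best_loss), best_perm


# Usage: the estimates are the references in shuffled order plus small noise.
rng = np.random.default_rng(0)
refs = rng.standard_normal((3, 16000))
ests = refs[[2, 0, 1]] + 0.01 * rng.standard_normal((3, 16000))
loss, perm = pit_mse(ests, refs)
print(perm, loss)
```

Because the total loss is a sum of pairwise terms, the same minimum can also be found as a linear assignment problem on `pair_loss` (for instance with `scipy.optimize.linear_sum_assignment`), which avoids the factorial search when N is large.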
Relationships
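- Introduced under the name PIT by Dong Yu, Morten Kolbæk, Zheng-Hua Tan, and Jesper Jensen (2017); john-hershey and colleagues had earlier used a closely related permutation-free objective in their deep clustering work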
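- Foundational to modern speech separation models (e.g. Conv-TasNet, Dual-Path RNN, SepFormer). Music separation models such as Demucs and Open-Unmix assign each output to a fixed stem (vocals, drums, bass, other) and therefore do not need PIT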
- Related to ../concepts/deep-clustering-separation: an alternative approach to the permutation problem via embedding and clustering
Sources
None ingested yet — seed batch setup.