Basic Pitch

Summary

Basic Pitch is Spotify's lightweight neural network for polyphonic note transcription. It is instrument-agnostic: a single model works on any pitched instrument without instrument-specific training. The input is a harmonically stacked constant-Q transform (CQT), which a small CNN processes. The model is ~15MB and runs on CPU in near real-time. It outputs MIDI with multi-pitch predictions and pitch bend detection. Presented at ICASSP 2022 (110+ citations).

Key Claims

  • Instrument-agnostic transcription is feasible with a single lightweight model
  • Harmonic stacking of CQT representations provides sufficient harmonic structure information
  • Pitch bend detection handles expressive playing (slides, bends, vibrato)
  • Consumer-grade performance: runs on CPU, ~15MB model size
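
The harmonic stacking idea in the claims above can be sketched in a few lines of NumPy. This is an illustrative sketch, not Basic Pitch's own code: the function name harmonic_stack, the array layout, and the harmonic list in the demo are assumptions. Because CQT bins are log-spaced, the h-th harmonic of a pitch sits a fixed number of bins above it, so shifting the CQT by that amount and stacking the copies as channels lines up a note's harmonics at the same bin index.

```python
import numpy as np

def harmonic_stack(cqt, bins_per_octave, harmonics):
    """Stack frequency-shifted copies of a CQT along a new channel axis.

    cqt: (n_bins, n_frames) magnitude array with log-spaced frequency bins.
    Returns an array of shape (len(harmonics), n_bins, n_frames).
    """
    n_bins, _ = cqt.shape
    channels = []
    for h in harmonics:
        # Harmonic h lies log2(h) octaves above the fundamental.
        shift = int(round(bins_per_octave * np.log2(h)))
        shifted = np.zeros_like(cqt)
        if abs(shift) < n_bins:
            if shift >= 0:
                # Pull higher bins down so harmonic h aligns with the fundamental.
                shifted[: n_bins - shift] = cqt[shift:]
            else:
                # Sub-harmonic: push lower bins up instead.
                shifted[-shift:] = cqt[: n_bins + shift]
        channels.append(shifted)
    return np.stack(channels, axis=0)

# Toy input: one frame, a fundamental at bin 0 and its octave at bin 12.
cqt = np.zeros((24, 1))
cqt[0, 0] = 1.0
cqt[12, 0] = 0.5
stacked = harmonic_stack(cqt, bins_per_octave=12, harmonics=[0.5, 1, 2])
print(stacked.shape)               # (3, 24, 1)
print(stacked[:, 0, 0].tolist())   # [0.0, 1.0, 0.5]
```

After stacking, a small convolution kernel centred on one bin sees the fundamental and its harmonics together across channels, which is why a compact CNN can get by without a wide frequency receptive field.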

Relevance to Banjo

The best starting point for banjo solo transcription. Because the model is instrument-agnostic, it can be applied to banjo without modification. Pitch bend detection covers slides and chokes (string bends), which are essential to bluegrass banjo expression. It outputs MIDI directly.

Limitations: no string/fret assignment (so no tab output), no playing-technique classification, and polyphonic accuracy degrades with dense note clusters.
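
To illustrate why the missing string/fret assignment matters, the sketch below lists every position where a given MIDI pitch can be played on a 5-string banjo. It is not part of Basic Pitch. The open G tuning (gDGBD), the 22-fret limit, and the helper name candidate_positions are assumptions made for the example.

```python
# Open G tuning (gDGBD) as MIDI note numbers, keyed by string number.
OPEN_G = {1: 62, 2: 59, 3: 55, 4: 50, 5: 67}
FIFTH_STRING_NUT = 5  # the short 5th string starts at the 5th fret
MAX_FRET = 22         # assumed fretboard length

def candidate_positions(midi_pitch):
    """Return every (string, fret) pair that sounds the given MIDI pitch."""
    positions = []
    for string, open_pitch in OPEN_G.items():
        offset = midi_pitch - open_pitch
        if offset < 0:
            continue  # pitch is below this string's open note
        if string == 5 and offset > 0:
            # Frets on the short string are numbered along the full neck.
            fret = offset + FIFTH_STRING_NUT
        else:
            fret = offset
        if fret <= MAX_FRET:
            positions.append((string, fret))
    return positions

print(candidate_positions(62))  # [(1, 0), (2, 3), (3, 7), (4, 12)]
print(len(candidate_positions(67)))  # 5
```

A single D4 already has four playable positions, so turning Basic Pitch's MIDI into tab needs a separate step that picks among them, for example by hand position or roll pattern.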

GitHub: spotify/basic-pitch (5k stars). Install: pip install basic-pitch.
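
A minimal usage sketch after installing, based on the predict function shown in the project README. The wrapper name transcribe_to_midi and the file paths are hypothetical, and default thresholds are assumed; check the README of the installed version for the exact options.

```python
def transcribe_to_midi(audio_path, midi_path):
    """Transcribe one audio file with Basic Pitch and write a MIDI file."""
    # Imported here so this module loads even where basic-pitch is not installed.
    from basic_pitch.inference import predict

    # predict returns the raw model output, a PrettyMIDI object,
    # and a list of note events.
    model_output, midi_data, note_events = predict(audio_path)
    midi_data.write(midi_path)
    return note_events

# Hypothetical usage:
# notes = transcribe_to_midi("banjo_solo.wav", "banjo_solo.mid")
```

The returned note events carry onset, offset, and pitch information, which is the natural input for any later tab-assignment step.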