Guitar Tablature Estimation with a Convolutional Neural Network

Summary

TabCNN presents a convolutional neural network approach for estimating guitar tablature directly from audio: it predicts which string and fret position is being played at each time frame, rather than predicting pitch first and assigning pitches to strings in a post-processing step. The model takes a Constant-Q Transform (CQT) representation as input, which suits musical signals because its frequency bins are geometrically spaced and therefore align with musical pitch. A CNN backbone processes the CQT spectrogram, and specialized output layers jointly predict string and fret combinations. The model is trained and evaluated on the GuitarSet dataset, which provides hexaphonic (per-string) pickup recordings as ground truth.
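To illustrate why CQT bins line up with musical pitch, here is a minimal sketch of the center frequencies of a constant-Q analysis. The function names, the starting frequency (E2, the lowest string of a standard-tuned guitar), and the resolution of 12 bins per octave are illustrative assumptions, not settings taken from TabCNN.

```python
import math

def cqt_center_frequencies(fmin_hz, n_bins, bins_per_octave):
    """Center frequencies of geometrically spaced CQT bins.

    Bin k sits at fmin * 2**(k / bins_per_octave), so with 12 bins per
    octave, adjacent bins are exactly one equal-tempered semitone apart.
    """
    return [fmin_hz * 2.0 ** (k / bins_per_octave) for k in range(n_bins)]

def hz_to_midi(freq_hz):
    """Convert a frequency to a fractional MIDI note number (A4 = 440 Hz = 69)."""
    return 69.0 + 12.0 * math.log2(freq_hz / 440.0)

# Assumed illustrative values: start at E2 (MIDI 40), one bin per semitone.
E2_HZ = 440.0 * 2.0 ** ((40 - 69) / 12)
freqs = cqt_center_frequencies(E2_HZ, n_bins=48, bins_per_octave=12)
midi_notes = [round(hz_to_midi(f)) for f in freqs]
```

With these values every bin maps to exactly one semitone, and bin 12 sits one octave above bin 0. A finer resolution (more bins per octave) subdivides each semitone without changing that alignment.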

For bluegrass banjo transcription, TabCNN's approach is directly applicable: banjo tablature follows the same string/fret paradigm as guitar tablature, with a different number of strings (5 instead of 6) and different tunings. Because CQT bins are spaced by pitch, the input representation can cover the banjo's pitch range without changing the method. The key insight is transferable: a CNN can learn to map audio directly to instrument-specific performance notation (string + fret) rather than going through an intermediate pitch representation. The approach could be adapted by retraining on a banjo-specific dataset with per-string annotations. TabCNN establishes the baseline methodology for automated tablature transcription from audio, which feeds directly into MusicXML tab notation generation.
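The shared string/fret paradigm can be made concrete with a small sketch that maps tablature positions to MIDI pitches. The tunings (standard guitar, open-G five-string banjo), the 19-fret limit, and all names are assumptions for illustration. The sketch also ignores the banjo's shortened fifth string, whose frets begin partway up the neck.

```python
# Open-string MIDI pitches for two assumed tunings.
GUITAR_STANDARD = [40, 45, 50, 55, 59, 64]  # E2 A2 D3 G3 B3 E4 (6th to 1st string)
BANJO_OPEN_G = [67, 50, 55, 59, 62]         # G4 D3 G3 B3 D4 (5th to 1st string)

def tab_to_midi(tuning, frets):
    """Map per-string fret numbers to MIDI pitches.

    frets[i] is the fret on string i, or None if the string is silent.
    Fret 0 is the open string; each fret raises the pitch one semitone.
    """
    if len(frets) != len(tuning):
        raise ValueError("need one fret entry per string")
    return [None if f is None else open_pitch + f
            for open_pitch, f in zip(tuning, frets)]

def positions_for_pitch(tuning, midi_pitch, max_fret=19):
    """All (string_index, fret) pairs that produce a given pitch."""
    return [(s, midi_pitch - open_pitch)
            for s, open_pitch in enumerate(tuning)
            if 0 <= midi_pitch - open_pitch <= max_fret]

# The same note (G3, MIDI 55) has several playable positions,
# which is why tablature carries more information than pitch alone.
guitar_g3 = positions_for_pitch(GUITAR_STANDARD, 55)
banjo_g3 = positions_for_pitch(BANJO_OPEN_G, 55)
```

Only the tuning table and its length differ between the two instruments here, which is the sense in which the string/fret formulation carries over. The position lists also show the ambiguity that a pitch-only transcription leaves unresolved.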

Key Claims

  • First CNN-based approach for direct guitar tablature estimation from audio
  • Uses CQT (Constant-Q Transform) as a musically motivated input representation
  • Predicts string/fret combinations directly, with no separate pitch detection and string assignment step
  • Trained and evaluated on GuitarSet with hexaphonic pickup ground truth
  • Demonstrates that a deep network can learn the physical constraints of a fretted instrument
  • Establishes baseline methodology for automated tablature transcription
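
The direct string/fret prediction in the claims above can be illustrated by decoding one frame of network output. This is a minimal sketch, assuming one score vector per string in which class 0 means "string not played" and class k means fret k - 1. That class ordering, the function name, and the toy scores are assumptions, not details confirmed by this summary.

```python
def decode_frame(string_scores):
    """Pick the highest-scoring class on each string independently.

    Returns one entry per string: None if the string is silent,
    otherwise the fret number (0 = open string).
    """
    frame = []
    for scores in string_scores:
        best = max(range(len(scores)), key=lambda c: scores[c])
        frame.append(None if best == 0 else best - 1)
    return frame

# Toy scores for a 3-string, 4-class example (silent, open, fret 1, fret 2).
toy_scores = [
    [0.9, 0.05, 0.03, 0.02],  # silent
    [0.1, 0.7, 0.1, 0.1],     # open string
    [0.1, 0.1, 0.2, 0.6],     # fret 2
]
decoded = decode_frame(toy_scores)
```

Because each string receives exactly one class, a decoded frame can never place two frets on the same string. A per-string output of this kind enforces that physical constraint by construction.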