license: apache-2.0
language:
  - en
tags:
  - music
  - midi
  - audio-to-midi
  - polyphonic-transcription
  - basic-pitch
  - onnx

Basic Pitch (ONNX) Mirror

Vendored copy of Spotify's Basic Pitch ICASSP 2022 polyphonic transcription model in ONNX format, re-hosted for use in the MAESTRO AI Workstation.

What this model does

Polyphonic audio → MIDI transcription for pitched instruments such as guitar, bass, vocals, synth, and piano. The model file is small (~230 KB) and inference is fast.

Why ONNX (not the pip package)?

The official basic-pitch PyPI package depends on tensorflow<2.15.1, which has no Python 3.14 wheels and would conflict with the torch installation in the MAESTRO backend. Spotify also ships the same model as a small ONNX export. This repository serves that export, and MAESTRO runs it via onnxruntime: same model, no TensorFlow dependency chain.
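As a rough illustration of running an ONNX export through onnxruntime, the sketch below opens a model and performs one forward pass. It is not MAESTRO's loader: the function names `run_model` and `name_outputs` are hypothetical, the model path is a placeholder, and the code assumes the model has a single audio input. Input and output names are read from the session at runtime rather than hard-coded, since this card does not list them.

```python
from typing import Any, Dict, Sequence


def name_outputs(names: Sequence[str], values: Sequence[Any]) -> Dict[str, Any]:
    """Pair each output tensor with its graph name (same order as session.run)."""
    if len(names) != len(values):
        raise ValueError("output name/value count mismatch")
    return dict(zip(names, values))


def run_model(model_path: str, audio_batch: Any) -> Dict[str, Any]:
    """Run one forward pass; audio_batch must already match the model's input shape."""
    # Imported lazily so the helper above stays usable without onnxruntime installed.
    import onnxruntime as ort

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    input_meta = session.get_inputs()[0]  # assumption: a single audio input
    output_names = [out.name for out in session.get_outputs()]
    values = session.run(output_names, {input_meta.name: audio_batch})
    return name_outputs(output_names, values)
```

A call would look like `run_model("path/to/model.onnx", batch)`, where `batch` is a float32 array shaped as `session.get_inputs()[0].shape` reports. Inspecting that shape first is the safest way to learn the expected window length.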

Architecture

A CNN spectrogram encoder feeds three output heads that predict pitch, onset, and note activations. See the ICASSP 2022 paper (cited below) for details.
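To make the role of a per-frame output head concrete, here is a deliberately simplified sketch of turning a frames × pitches activation matrix into note events by thresholding. It is not Basic Pitch's actual post-processing, which also uses the onset head. The function name `frames_to_notes`, the threshold of 0.5, the two-frame minimum length, and the assumption that column 0 corresponds to MIDI note 21 are all illustrative choices.

```python
from typing import List, Optional, Tuple

Note = Tuple[int, int, int]  # (midi_pitch, start_frame, end_frame_exclusive)


def frames_to_notes(
    note_probs: List[List[float]],
    threshold: float = 0.5,   # assumed activation cutoff
    min_frames: int = 2,      # assumed minimum note length, in frames
    lowest_midi: int = 21,    # assumed MIDI number of pitch column 0
) -> List[Note]:
    """Group consecutive above-threshold frames in each pitch column into notes."""
    notes: List[Note] = []
    if not note_probs:
        return notes
    n_frames = len(note_probs)
    n_pitches = len(note_probs[0])
    for p in range(n_pitches):
        start: Optional[int] = None
        # Iterate one step past the end so a note still open at the last frame is closed.
        for t in range(n_frames + 1):
            active = t < n_frames and note_probs[t][p] >= threshold
            if active and start is None:
                start = t
            elif not active and start is not None:
                if t - start >= min_frames:
                    notes.append((lowest_midi + p, start, t))
                start = None
    # Order by start frame, then pitch, as a MIDI writer would typically expect.
    return sorted(notes, key=lambda n: (n[1], n[0]))
```

Frame indices can be converted to seconds by dividing by the model's output frame rate, which this card does not state.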

License

Apache-2.0. Commercial use is permitted.

Usage in MAESTRO

Loaded by backend/ai/models/basic_pitch.py via onnxruntime; surfaced in the AI Workstation's TranscribePanel under General / Drums / Vocals mode tabs.

Citation

@inproceedings{2022_BittnerBRME_LightweightNoteTranscription_ICASSP,
  title={A lightweight instrument-agnostic model for polyphonic note transcription and multipitch estimation},
  author={Bittner, Rachel M. and Bosch, Juan Jos{\'e} and Rubinstein, David and Meseguer-Brocal, Gabriel and Ewert, Sebastian},
  booktitle={ICASSP 2022},
  year={2022}
}