---
license: apache-2.0
language:
- en
tags:
- music
- midi
- audio-to-midi
- polyphonic-transcription
- basic-pitch
- onnx
---

# Basic Pitch (ONNX) Mirror

A vendored copy of Spotify's [Basic Pitch](https://github.com/spotify/basic-pitch)
polyphonic transcription model (ICASSP 2022) in ONNX format, re-hosted for use
in the [MAESTRO AI Workstation](https://github.com/AEmotionStudio).
|
## What this model does

**Audio → MIDI polyphonic transcription** for any pitched instrument, such as
guitar, bass, vocals, synth, or piano. Lightweight (~230 KB) and fast.
|
## Why ONNX (not the pip package)?

The official `basic-pitch` PyPI package depends on `tensorflow<2.15.1`,
which has no Python 3.14 wheels and would conflict with the MAESTRO
backend's torch installation. Spotify ships the same model as a small
ONNX export, which we serve here and run via `onnxruntime`: the same model,
without the TensorFlow dependency chain.
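
As a rough illustration of running the export through `onnxruntime` without TensorFlow, the sketch below opens a model on CPU and lists its input and output tensors. The helper names `load_session` and `describe_io` are hypothetical and are not taken from the MAESTRO backend; the tensor names and shapes are read from the file at run time because this card does not state them.

```python
from typing import Any


def load_session(model_path: str) -> Any:
    """Open an ONNX model on CPU. Requires `pip install onnxruntime`."""
    # Imported lazily so describe_io() stays usable without onnxruntime.
    import onnxruntime as ort

    return ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])


def describe_io(session: Any) -> dict:
    """Return the names and shapes of a session's input and output tensors."""
    return {
        "inputs": [(arg.name, arg.shape) for arg in session.get_inputs()],
        "outputs": [(arg.name, arg.shape) for arg in session.get_outputs()],
    }
```

Calling `describe_io(load_session(path))` with the path of the downloaded `.onnx` file shows which input name to pass to `session.run(None, {input_name: audio_array})`, which avoids hard-coding names that may differ between exports.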
|
## Architecture

A CNN spectrogram encoder followed by separate output heads for pitch, onset,
and note prediction. See the
[ICASSP 2022 paper](https://arxiv.org/abs/2203.09893) for details.
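
To make the role of the onset and note heads concrete, here is a simplified sketch of turning their frame-by-pitch activations into note events by thresholding. It is not the post-processing used by Spotify or by MAESTRO, which is more involved. The function `decode_notes`, the thresholds, the 88-bin layout starting at MIDI note 21, and the frame rate of 22050 / 256 frames per second are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed layout, not stated in this model card: pitch bin 0 is MIDI note 21
# (A0), and frames arrive at roughly 86 per second (22050 Hz / hop of 256).
LOWEST_MIDI_NOTE = 21
FRAMES_PER_SECOND = 22050 / 256

NoteEvent = Tuple[float, float, int]  # (start_seconds, end_seconds, midi_pitch)


def decode_notes(
    note_probs: List[List[float]],
    onset_probs: List[List[float]],
    note_threshold: float = 0.5,
    onset_threshold: float = 0.5,
) -> List[NoteEvent]:
    """Turn [frame][pitch] activations into note events by thresholding."""
    events: List[NoteEvent] = []
    n_frames = len(note_probs)
    n_pitches = len(note_probs[0]) if n_frames else 0
    for pitch_bin in range(n_pitches):
        start = None  # frame index where the currently sounding note began
        for frame in range(n_frames):
            active = note_probs[frame][pitch_bin] >= note_threshold
            onset = onset_probs[frame][pitch_bin] >= onset_threshold
            # Close the running note if it stopped or an onset re-triggers it.
            if start is not None and (not active or onset):
                events.append(_make_event(start, frame, pitch_bin))
                start = None
            if active and start is None:
                start = frame
        # A note still sounding at the end of the clip ends with the clip.
        if start is not None:
            events.append(_make_event(start, n_frames, pitch_bin))
    return sorted(events)


def _make_event(start_frame: int, end_frame: int, pitch_bin: int) -> NoteEvent:
    """Convert frame indices and a pitch bin to seconds and a MIDI pitch."""
    return (
        start_frame / FRAMES_PER_SECOND,
        end_frame / FRAMES_PER_SECOND,
        LOWEST_MIDI_NOTE + pitch_bin,
    )
```

The note head decides how long a pitch sounds, while the onset head splits a sustained activation into repeated notes. The resulting `(start, end, pitch)` tuples are the kind of data a MIDI writer would consume.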
|
## License

**Apache-2.0**: commercial use is permitted.
|
## Usage in MAESTRO

Loaded by `backend/ai/models/basic_pitch.py` via `onnxruntime` and surfaced
in the AI Workstation's `TranscribePanel` under the General / Drums / Vocals
mode tabs.
|
## Citation

```
@inproceedings{2022_BittnerBRME_LightweightNoteTranscription_ICASSP,
  title={A lightweight instrument-agnostic model for polyphonic note transcription and multipitch estimation},
  author={Bittner, Rachel M. and Bosch, Juan Jos{\'e} and Rubinstein, David and Meseguer-Brocal, Gabriel and Ewert, Sebastian},
  booktitle={ICASSP 2022},
  year={2022}
}
```
|