	---
	license: apache-2.0
	language:
	- en
	tags:
	- music
	- midi
	- audio-to-midi
	- polyphonic-transcription
	- basic-pitch
	- onnx
	---

	# Basic Pitch (ONNX) Mirror

	Vendored copy of Spotify's [Basic Pitch](https://github.com/spotify/basic-pitch)
	ICASSP 2022 polyphonic transcription model in ONNX format, re-hosted for use
	in the [MAESTRO AI Workstation](https://github.com/AEmotionStudio).

	## What this model does

	Polyphonic audio-to-MIDI transcription for pitched instruments such as
	guitar, bass, vocals, synth, and piano. The model file is small (~230 KB)
	and inference is fast.
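
The network emits frame-level activations rather than MIDI, so a decoding step has to turn those activations into note events. The sketch below is a deliberately simplified illustration of that step, not the upstream `basic-pitch` decoder (which also uses the onset output): it treats any sufficiently long run of above-threshold frames in one pitch bin as a note. The function name, the threshold, the minimum length, and the assumption that bin 0 maps to MIDI 21 (A0) are all illustrative choices.

```python
from typing import List, Tuple

# (start_frame, end_frame_exclusive, midi_pitch)
NoteEvent = Tuple[int, int, int]


def activations_to_notes(
    activations: List[List[float]],
    threshold: float = 0.5,   # illustrative value, not an upstream default
    min_frames: int = 2,      # illustrative value, not an upstream default
    lowest_midi: int = 21,    # assumption: bin 0 corresponds to A0 (MIDI 21)
) -> List[NoteEvent]:
    """Turn a frames x pitch-bins activation grid into note events.

    A note is any run of consecutive frames in one pitch bin whose
    activation stays at or above `threshold` for at least `min_frames`.
    """
    notes: List[NoteEvent] = []
    if not activations:
        return notes

    n_frames = len(activations)
    n_bins = len(activations[0])
    for b in range(n_bins):
        start = None
        # Iterate one step past the end so a run touching the last frame closes.
        for t in range(n_frames + 1):
            active = t < n_frames and activations[t][b] >= threshold
            if active and start is None:
                start = t
            elif not active and start is not None:
                if t - start >= min_frames:
                    notes.append((start, t, lowest_midi + b))
                start = None
    notes.sort()
    return notes


grid = [
    [0.9, 0.1],
    [0.8, 0.7],
    [0.7, 0.9],
    [0.1, 0.8],
]
print(activations_to_notes(grid))  # [(0, 3, 21), (1, 4, 22)]
```

Frame indices can be converted to seconds once the model's frame rate is known, and the resulting events written out with any MIDI library.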

	## Why ONNX (not the pip package)?

	The official `basic-pitch` PyPI package depends on `tensorflow<2.15.1`,
	which has no Python 3.14 wheels and would conflict with the MAESTRO
	backend's `torch` installation. Spotify ships the same model as a small
	ONNX export, which we mirror here and run via `onnxruntime`: same model,
	no TensorFlow dependency chain.
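
As a rough illustration of the `onnxruntime` route, the sketch below opens the model on CPU and lists its input and output tensors rather than hard-coding their names. The filename `nmp.onnx` and both helper functions are assumptions made for this example, not something this card specifies. `InferenceSession`, `get_inputs()`, and `get_outputs()` are standard `onnxruntime` calls.

```python
from typing import Any, Dict, List

MODEL_PATH = "nmp.onnx"  # hypothetical local filename for the downloaded model


def load_session(model_path: str = MODEL_PATH) -> Any:
    """Create a CPU-only onnxruntime session for the model file."""
    # Imported lazily so describe_io() can be used and tested without onnxruntime.
    import onnxruntime as ort

    return ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])


def describe_io(session: Any) -> Dict[str, List[Dict[str, Any]]]:
    """Summarize tensor names, shapes, and dtypes reported by the session."""

    def summarize(nodes: Any) -> List[Dict[str, Any]]:
        return [
            {"name": n.name, "shape": list(n.shape), "type": n.type}
            for n in nodes
        ]

    return {
        "inputs": summarize(session.get_inputs()),
        "outputs": summarize(session.get_outputs()),
    }
```

With the model file downloaded, `describe_io(load_session())` prints the tensor names to pass to `session.run(...)`, which avoids guessing them.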

	## Architecture

	A shallow CNN over a harmonically stacked constant-Q transform (CQT) with
	three output heads: onset, note, and multipitch (contour) posteriorgrams.
	See the [ICASSP 2022 paper](https://arxiv.org/abs/2203.09893) for details.
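
To make the output grid concrete, the sketch below maps output indices to musical units. The constants (22050 Hz audio, a 256-sample hop, 88 semitone bins starting at A0, 3 contour bins per semitone) are assumptions taken to match the upstream defaults and should be checked against the exported model. The time conversion also ignores the window-overlap correction that the upstream code applies.

```python
SAMPLE_RATE = 22050            # Hz; assumed upstream default
HOP_LENGTH = 256               # audio samples per output frame; assumed
N_SEMITONES = 88               # bins in the note and onset outputs; assumed
LOWEST_MIDI = 21               # A0; assumed pitch of bin 0
LOWEST_HZ = 27.5               # frequency of A0
CONTOUR_BINS_PER_SEMITONE = 3  # resolution of the multipitch output; assumed


def frame_to_seconds(frame: int) -> float:
    """Approximate start time of an output frame (no overlap correction)."""
    return frame * HOP_LENGTH / SAMPLE_RATE


def note_bin_to_midi(bin_index: int) -> int:
    """MIDI note number for a bin of the note or onset output."""
    if not 0 <= bin_index < N_SEMITONES:
        raise ValueError(f"note bin out of range: {bin_index}")
    return LOWEST_MIDI + bin_index


def midi_to_hz(midi: float) -> float:
    """Equal-temperament frequency with A4 (MIDI 69) tuned to 440 Hz."""
    return 440.0 * 2.0 ** ((midi - 69) / 12)


def contour_bin_to_hz(bin_index: int) -> float:
    """Frequency for a bin of the finer-grained multipitch output."""
    n_bins = N_SEMITONES * CONTOUR_BINS_PER_SEMITONE
    if not 0 <= bin_index < n_bins:
        raise ValueError(f"contour bin out of range: {bin_index}")
    return LOWEST_HZ * 2.0 ** (bin_index / (12 * CONTOUR_BINS_PER_SEMITONE))


print(note_bin_to_midi(0), note_bin_to_midi(87))  # 21 108
```

Under these assumptions the note grid spans the 88 piano keys (MIDI 21 to 108), while the contour grid covers the same range at one-third-semitone resolution.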

	## License

	Apache-2.0, which permits commercial use.

	## Usage in MAESTRO

	Loaded by `backend/ai/models/basic_pitch.py` via `onnxruntime`; surfaced
	in the AI Workstation's `TranscribePanel` under General / Drums / Vocals
	mode tabs.

	## Citation

	```
	@inproceedings{2022_BittnerBRME_LightweightNoteTranscription_ICASSP,
	  title={A lightweight instrument-agnostic model for polyphonic note transcription and multipitch estimation},
	  author={Bittner, Rachel M. and Bosch, Juan Jos{\'e} and Rubinstein, David and Meseguer-Brocal, Gabriel and Ewert, Sebastian},
	  booktitle={ICASSP 2022},
	  year={2022}
	}
	```