---
license: apache-2.0
language:
- en
tags:
- music
- midi
- audio-to-midi
- polyphonic-transcription
- basic-pitch
- onnx
---

# Basic Pitch (ONNX) Mirror

A vendored copy of Spotify's [Basic Pitch](https://github.com/spotify/basic-pitch),
the polyphonic transcription model presented at ICASSP 2022, in ONNX format.
It is re-hosted here for use in the [MAESTRO AI Workstation](https://github.com/AEmotionStudio).

## What this model does

**Audio → MIDI polyphonic transcription** for pitched instruments such as
guitar, bass, vocals, synth, and piano. The model is lightweight (~230 KB) and fast to run.

## Why ONNX (not the pip package)?

The official `basic-pitch` PyPI package depends on `tensorflow<2.15.1`,
which has no Python 3.14 wheels and would conflict with the MAESTRO
backend's torch installation. Spotify ships the same model as a small
ONNX export, which we serve here and run via `onnxruntime` — same model,
no TensorFlow dependency chain.

## Architecture

CNN spectrogram encoder + multi-head pitch/onset/note prediction. See the
[ICASSP 2022 paper](https://arxiv.org/abs/2203.09893) for details.
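
To make the idea of frame-wise note prediction concrete, here is a minimal sketch of turning a note-activation matrix into note events by simple thresholding. This is an illustration only, not Basic Pitch's actual post-processing: it ignores the onset head entirely, and the function name `extract_notes`, the `threshold` and `min_frames` parameters, and the `(pitch_index, start_frame, end_frame)` output format are all assumptions made for this example.

```python
from __future__ import annotations


def extract_notes(
    activations: list[list[float]],
    threshold: float = 0.5,   # hypothetical cut-off, not a value from the paper
    min_frames: int = 2,      # hypothetical minimum note length in frames
) -> list[tuple[int, int, int]]:
    """Threshold a frames x pitches activation matrix into note events.

    Returns (pitch_index, start_frame, end_frame) tuples, end exclusive,
    sorted by start frame and then pitch index.
    """
    if not activations:
        return []

    n_frames = len(activations)
    n_pitches = len(activations[0])
    notes: list[tuple[int, int, int]] = []

    for pitch in range(n_pitches):
        start = None  # frame where the currently open note began, if any
        for frame in range(n_frames):
            active = activations[frame][pitch] >= threshold
            if active and start is None:
                start = frame
            elif not active and start is not None:
                # The note just ended; keep it only if it is long enough.
                if frame - start >= min_frames:
                    notes.append((pitch, start, frame))
                start = None
        # Close a note that is still sounding at the last frame.
        if start is not None and n_frames - start >= min_frames:
            notes.append((pitch, start, n_frames))

    return sorted(notes, key=lambda n: (n[1], n[0]))
```

A real decoder would typically also use the onset output to split repeated notes of the same pitch, which plain thresholding cannot do.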

## License

**Apache-2.0** — commercial-use OK.

## Usage in MAESTRO

Loaded by `backend/ai/models/basic_pitch.py` via `onnxruntime` and surfaced
in the AI Workstation's `TranscribePanel` under the General / Drums / Vocals
mode tabs.
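
For readers who want to try the model outside MAESTRO, the following is a rough sketch of running an ONNX file through `onnxruntime` in fixed-length windows. It is not the contents of `backend/ai/models/basic_pitch.py`. It assumes the model takes a single float32 input shaped `(batch, samples, 1)` with a static `samples` dimension, which is read from the session rather than hard-coded; the function names and the non-overlapping default hop are hypothetical, and resampling the audio to the rate the model expects is left to the caller.

```python
from __future__ import annotations

from pathlib import Path


def window_starts(n_samples: int, window_len: int, hop: int) -> list[int]:
    """Start offsets of fixed-length windows that together cover n_samples."""
    if window_len <= 0 or hop <= 0 or hop > window_len:
        raise ValueError("need 0 < hop <= window_len")
    if n_samples <= 0:
        return []
    starts = [0]
    # Keep adding windows until the last one reaches the end of the audio.
    while starts[-1] + window_len < n_samples:
        starts.append(starts[-1] + hop)
    return starts


def run_windows(model_path: Path, audio, hop: int | None = None) -> list[dict]:
    """Run the model over mono audio, one window at a time.

    Returns one dict per window, mapping output names to arrays.
    """
    # Imported lazily so the helper above works without these packages.
    import numpy as np
    import onnxruntime as ort

    session = ort.InferenceSession(
        str(model_path), providers=["CPUExecutionProvider"]
    )
    model_input = session.get_inputs()[0]
    output_names = [o.name for o in session.get_outputs()]

    # Assumption: input layout is (batch, samples, 1) with a fixed sample count.
    window_len = model_input.shape[1]
    if not isinstance(window_len, int):
        raise ValueError(f"expected a static window length, got {window_len!r}")

    samples = np.asarray(audio, dtype=np.float32)
    results = []
    for start in window_starts(len(samples), window_len, hop or window_len):
        chunk = samples[start:start + window_len]
        # Zero-pad the final window up to the length the model expects.
        chunk = np.pad(chunk, (0, window_len - len(chunk)))
        outputs = session.run(None, {model_input.name: chunk[None, :, None]})
        results.append(dict(zip(output_names, outputs)))
    return results
```

Reading the window length and output names from the session keeps the sketch independent of any particular export; check them against the file you actually download before relying on the assumed layout.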

## Citation

```
@inproceedings{2022_BittnerBRME_LightweightNoteTranscription_ICASSP,
  title={A lightweight instrument-agnostic model for polyphonic note transcription and multipitch estimation},
  author={Bittner, Rachel M. and Bosch, Juan Jos{\'e} and Rubinstein, David and Meseguer-Brocal, Gabriel and Ewert, Sebastian},
  booktitle={ICASSP 2022},
  year={2022}
}
```