MINGUS: Weimar Jazz Database
Trained checkpoints for MINGUS (Melodic Improvisation Neural Generator Using Seq2seq), retrained from scratch on the Weimar Jazz Database.
Used by the comparison pipeline of the diploma research project kudrmax/jazz-generation-research, alongside CMT and BebopNet.
Two ablation runs
We ran MINGUS on WjazzD with two conditioning configurations to check
whether the optimal conditioning reported in the paper carries over to our
test set (40 solos: the cross-model intersection from
wjazzd_split.json,
not the random 20% split of 86 solos used in the paper).
| run | pitch cond | duration cond | pitch test_ppl | pitch test_acc | duration test_ppl | duration test_acc |
|---|---|---|---|---|---|---|
| paper/ (default) | I-C-NC-B-BE-O | I-C-NC-B-BE-O | 13.08 | 0.131 | 4.087 | 0.344 |
| paper-optimal/ | D-C-B-BE-O | B-BE-O | 13.53 | 0.120 | 4.048 | 0.345 |
| Madaghiele 2021 (paper) | D-C-B-BE-O | B-BE-O | 11.01 | 0.163 | 4.140 | 0.323 |
Observations. The duration model in both of our runs matches or slightly beats the paper's test perplexity (4.05 to 4.09 vs 4.14), which suggests that our pipeline reproduces the original training correctly. The pitch model is about 19% to 23% worse on test perplexity (13.08 and 13.53 vs 11.01) under both conditioning choices. The most plausible explanation is that our test set is a different, and apparently harder, 40-solo subset than the paper's 86-solo random split: the conditioning swap that is supposed to help did not help on our set. We are keeping both runs for reference.
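The gaps quoted above can be recomputed directly from the table. The snippet below is a minimal sketch of that arithmetic; the dictionary layout and the `relative_gap` helper are illustrative names, not part of the MINGUS code base.

```python
# Test perplexities copied from the table above.
PAPER = {"pitch_ppl": 11.01, "duration_ppl": 4.140}
RUNS = {
    "paper/": {"pitch_ppl": 13.08, "duration_ppl": 4.087},
    "paper-optimal/": {"pitch_ppl": 13.53, "duration_ppl": 4.048},
}


def relative_gap(ours: float, reference: float) -> float:
    """Signed relative difference; positive means our perplexity is worse."""
    return (ours - reference) / reference


gaps = {
    run: {metric: relative_gap(value, PAPER[metric]) for metric, value in metrics.items()}
    for run, metrics in RUNS.items()
}

for run, by_metric in gaps.items():
    for metric, gap in by_metric.items():
        print(f"{run:15s} {metric:13s} {gap:+.1%}")
```

For the pitch model this gives roughly +18.8% (paper/) and +22.9% (paper-optimal/); for the duration model both gaps are negative, at about 1% to 2% below the paper's value.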
Files
paper/ # default conditioning (I-C-NC-B-BE-O for both)
├── pitchModel/MINGUS COND I-C-NC-B-BE-O Epochs 10.pt
├── durationModel/MINGUS COND I-C-NC-B-BE-O Epochs 10.pt
├── pitch_state.pt # resume checkpoint
└── duration_state.pt
paper-optimal/ # paper §3.2 optimal: pitch=D-C-B-BE-O, duration=B-BE-O
├── pitchModel/MINGUS COND D-C-B-BE-O Epochs 10.pt
├── durationModel/MINGUS COND B-BE-O Epochs 10.pt
├── pitch_state.pt
└── duration_state.pt
The two final .pt files in pitchModel/ and durationModel/ are what the
original authors' C_generate/generate.py and our GeneratorMingus wrapper
expect at those exact paths (with spaces in the filename, as published by the
authors).
The *_state.pt files are checkpoints produced by our resume-aware
B_train/train.py patch. They are useful only if you want to continue training
from epoch 10 instead of starting fresh.
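Because the filenames contain spaces and encode the conditioning string, it can help to build the expected paths programmatically before running generation. The following is a small sketch under that assumption; `checkpoint_path` and `missing_checkpoints` are hypothetical helpers, not functions from MINGUS or from our wrapper.

```python
from pathlib import Path

# Conditioning strings per run, copied from the file listing above.
CONDITIONING = {
    "paper": {"pitch": "I-C-NC-B-BE-O", "duration": "I-C-NC-B-BE-O"},
    "paper-optimal": {"pitch": "D-C-B-BE-O", "duration": "B-BE-O"},
}


def checkpoint_path(root, run, phase, epochs=10):
    """Build the expected path of a final checkpoint (the filename contains spaces)."""
    cond = CONDITIONING[run][phase]
    return Path(root) / run / f"{phase}Model" / f"MINGUS COND {cond} Epochs {epochs}.pt"


def missing_checkpoints(root, run):
    """Return the final checkpoints of `run` that are absent under `root`."""
    paths = [checkpoint_path(root, run, phase) for phase in ("pitch", "duration")]
    return [p for p in paths if not p.is_file()]
```

Calling `missing_checkpoints("result", "paper-optimal")` after a download should return an empty list if both files landed where the listing says they should.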
Common training setup
- Architecture: two parallel Transformer phases (a pitch model and a duration model), each with 4 layers / 4 heads / d_model=200.
- Dataset: Weimar Jazz Database, 4/4 solos only. Split (train/val/test): 340 / 42 / 40. Test = canonical cross-model intersection (details in source repo).
- Training: 10 epochs per phase, BPTT 35, batch 20, SGD+momentum, StepLR.
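The numbers in the list above can be collected into a small config object for sanity checks. This is only a sketch: `PhaseConfig` and `split_fractions` are illustrative names, and hyperparameters the list does not state (learning rate, momentum, StepLR step size and decay) are deliberately left out rather than guessed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhaseConfig:
    """Hyperparameters shared by the pitch and duration phases (values from the list above)."""
    n_layers: int = 4
    n_heads: int = 4
    d_model: int = 200
    epochs: int = 10
    bptt: int = 35
    batch_size: int = 20

    @property
    def head_dim(self) -> int:
        # Standard multi-head attention splits d_model evenly across heads.
        if self.d_model % self.n_heads != 0:
            raise ValueError("d_model must be divisible by n_heads")
        return self.d_model // self.n_heads


# Number of solos per split, as stated above.
SPLIT = {"train": 340, "val": 42, "test": 40}


def split_fractions(split):
    """Share of the whole dataset that falls into each split."""
    total = sum(split.values())
    return {name: count / total for name, count in split.items()}


cfg = PhaseConfig()
fractions = split_fractions(SPLIT)
print(cfg.head_dim, {name: f"{frac:.1%}" for name, frac in fractions.items()})
```

With these values each attention head has dimension 50, and the 40-solo test set is about 9.5% of the 422 solos in the split.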
Download
pip install -U huggingface_hub
# paper-optimal run (recommended for comparison-pipeline use)
hf download maxkudryashov/mingus-1 \
"paper-optimal/pitchModel/MINGUS COND D-C-B-BE-O Epochs 10.pt" \
"paper-optimal/durationModel/MINGUS COND B-BE-O Epochs 10.pt" \
--local-dir result
# default-conditioning run
hf download maxkudryashov/mingus-1 \
"paper/pitchModel/MINGUS COND I-C-NC-B-BE-O Epochs 10.pt" \
"paper/durationModel/MINGUS COND I-C-NC-B-BE-O Epochs 10.pt" \
--local-dir result
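The same files can also be fetched from Python with `hf_hub_download` from huggingface_hub. The sketch below mirrors the CLI commands above; `RUN_FILES` and `download_run` are illustrative names, and the import is done lazily so the file list can be inspected without the package installed.

```python
REPO_ID = "maxkudryashov/mingus-1"

# Final checkpoints per run, with the same paths as in the CLI commands above.
RUN_FILES = {
    "paper-optimal": [
        "paper-optimal/pitchModel/MINGUS COND D-C-B-BE-O Epochs 10.pt",
        "paper-optimal/durationModel/MINGUS COND B-BE-O Epochs 10.pt",
    ],
    "paper": [
        "paper/pitchModel/MINGUS COND I-C-NC-B-BE-O Epochs 10.pt",
        "paper/durationModel/MINGUS COND I-C-NC-B-BE-O Epochs 10.pt",
    ],
}


def download_run(run, local_dir="result"):
    """Download both final checkpoints of `run` and return their local paths."""
    from huggingface_hub import hf_hub_download  # requires: pip install -U huggingface_hub

    return [
        hf_hub_download(repo_id=REPO_ID, filename=name, local_dir=local_dir)
        for name in RUN_FILES[run]
    ]
```

`download_run("paper-optimal")` should place the files under result/ with the repository's directory layout preserved.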
Reproducibility
Trained via the Colab notebook at
models/MINGUS/training/colab_train.ipynb
in our MINGUS fork. The notebook is idempotent: rerunning it resumes from the
last completed epoch via <work_dir>/<phase>_state.pt.
paper/ was trained on a Colab CPU runtime (628s pitch + 484s duration ≈ 18.5 min).
paper-optimal/ was trained on a Colab A100 GPU (53s pitch + 51s duration ≈ 2 min).
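The resume behaviour described above follows a common pattern: record the last completed epoch after each epoch, and skip everything up to it on the next run. The sketch below illustrates that pattern only. It is not the code from our B_train/train.py patch: a JSON file stands in for the <phase>_state.pt checkpoint (which in a real run would also hold model and optimizer state), and `train_phase` and `run_epoch` are hypothetical names.

```python
import json
import tempfile
from pathlib import Path


def train_phase(work_dir, phase, total_epochs, run_epoch):
    """Run the remaining epochs of `phase`, recording progress after each one."""
    state_file = Path(work_dir) / f"{phase}_state.json"  # stand-in for <phase>_state.pt
    state = json.loads(state_file.read_text()) if state_file.exists() else {"epoch": 0}
    executed = []
    for epoch in range(state["epoch"] + 1, total_epochs + 1):
        run_epoch(epoch)  # one full training epoch
        state["epoch"] = epoch
        state_file.write_text(json.dumps(state))  # persist only after the epoch completes
        executed.append(epoch)
    return executed


with tempfile.TemporaryDirectory() as work_dir:
    first = train_phase(work_dir, "pitch", 3, run_epoch=lambda epoch: None)
    second = train_phase(work_dir, "pitch", 3, run_epoch=lambda epoch: None)
    print(first, second)
```

The second call finds the state file at epoch 3 and runs nothing, which is what makes rerunning safe.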