STRUM: Spectral Transcription & Rhythm Understanding Model

End-to-end pipeline that turns a song (.wav / .mp3 / .ogg) into a fully playable Clone Hero / YARG chart package: drums, guitar, bass, vocals (with lyrics), and keys.

Source: https://github.com/opria123/strum

What's in this repo

| Folder | What it is | Used by |
|---|---|---|
| drums/drums_v14/ | TwoStageDrumsCRNN onset detector (mel input, 22050 Hz) | batch_infer_hybrid.py Stage 1 |
| drums/drums_mc_onset/ | Multi-class onset head fine-tuned on V14 backbone | Stage-1 alt head |
| drums/drums_phase3/ | Phase-3 multi-class rescue model | Late-stage rescue / reclassify |
| drums/drums_cymbal_onset/ | Cymbal-specialist onset head | Cymbal-specific rescue |
| drums/tom_refinement_demucs/ | Tom vs. cymbal CNN running on Demucs drum stem | Tom/cymbal disambiguation |
| drums_classifier_ensemble/ | 6-model OnsetClassifier ensemble (V2, V4, V6, V12c, V15, V16) + V17 | Per-onset 8-lane classification |
| guitar/guitar_v2_onset/ | Guitar onset CRNN (Event F1 0.81) | Hybrid guitar pipeline |
| guitar/fret_mapper_v4.pt | Pitch → 5-fret mapper (replaces librosa rule mapper) | Hybrid guitar pipeline |
| section_classifier/ | Verse/chorus/bridge section labeler | Chart sections |

Performance

Held-out test set (from 3,299 human-authored Pro Drum charts):

| Component | Metric | Score |
|---|---|---|
| Drums onset detection (V14) | Frame F1 | 93.9% |
| Drums lane classification (6-ensemble) | Per-onset F1 | 85.2% |

End-to-end vs ground-truth Clone Hero / YARG charts on an in-envelope benchmark of 29 songs sampled from a 3,299-song held-out pool. Songs were pre-screened with a single audio-feature gate (median Demucs htdemucs_6s drum-stem RMS ≥ 0.018, 1 s windows at 22050 Hz mono). Eval is Expert difficulty, ±100 ms tolerance, with a per-song global offset search (±200 ms / 10 ms steps).

| Instrument | F1 | Precision | Recall |
|---|---|---|---|
| Drums | 83.8% | 82.4% | 85.4% |
| Guitar | 65.1% | 74.5% | 57.8% |
| Bass | 69.4% | 65.8% | 73.4% |
| Vocals | 53.9% | 63.2% | 47.0% |
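The pre-screening gate described above (median drum-stem RMS ≥ 0.018 over 1 s windows at 22050 Hz mono) can be illustrated with a minimal sketch. It is not the repo's implementation: it assumes the Demucs drum stem has already been separated and loaded as a list of float samples in [-1, 1], that windows are non-overlapping, and that a trailing partial window is dropped. The function names `window_rms` and `passes_gate` are hypothetical.

```python
import math
import statistics

SAMPLE_RATE = 22050     # Hz, mono (value quoted in the text)
WINDOW_SECONDS = 1.0    # 1 s analysis windows (value quoted in the text)
RMS_THRESHOLD = 0.018   # gate threshold (value quoted in the text)


def window_rms(samples, sample_rate=SAMPLE_RATE, window_seconds=WINDOW_SECONDS):
    """Return the RMS of each full, non-overlapping window of `samples`.

    Assumption: windows do not overlap and a trailing partial window is ignored.
    """
    size = int(sample_rate * window_seconds)
    values = []
    for start in range(0, len(samples) - size + 1, size):
        window = samples[start:start + size]
        values.append(math.sqrt(sum(x * x for x in window) / size))
    return values


def passes_gate(samples, threshold=RMS_THRESHOLD):
    """True if the median per-window RMS of the drum stem meets the threshold."""
    rms = window_rms(samples)
    # Clips shorter than one window produce no RMS values and fail the gate.
    return bool(rms) and statistics.median(rms) >= threshold
```

Using the median rather than the mean means a song with a few very loud drum fills but mostly silent drums would still fail the gate, which seems to be the intent of an "in-envelope" filter.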

See the source repo's benchmark_results.json for per-song breakdown and scripts/eval_benchmark.py for the harness.
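As a rough illustration of the scoring rule described above (±100 ms tolerance with a per-song global offset search over ±200 ms in 10 ms steps), the sketch below matches predicted and reference note times one-to-one and keeps the offset with the best F1. It is not a copy of scripts/eval_benchmark.py: the greedy matching strategy, the tie-break toward the smallest absolute offset, and the function names are all assumptions made for this example.

```python
def match_count(pred, ref, tol):
    """Greedy one-to-one matching of onset times (seconds) within `tol`."""
    pred, ref = sorted(pred), sorted(ref)
    i = j = hits = 0
    while i < len(pred) and j < len(ref):
        diff = pred[i] - ref[j]
        if abs(diff) <= tol:
            hits += 1
            i += 1
            j += 1
        elif diff < 0:
            i += 1  # prediction is too early for every remaining reference
        else:
            j += 1  # reference is too early for every remaining prediction
    return hits


def prf(pred, ref, tol=0.100):
    """Return (precision, recall, F1) for one song at a fixed alignment."""
    hits = match_count(pred, ref, tol)
    precision = hits / len(pred) if pred else 0.0
    recall = hits / len(ref) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


def best_offset_prf(pred, ref, tol=0.100, max_shift_ms=200, step_ms=10):
    """Search a global offset for `pred`; return (precision, recall, F1, shift_ms)."""
    best = (0.0, 0.0, 0.0, 0)
    for shift_ms in range(-max_shift_ms, max_shift_ms + 1, step_ms):
        shifted = [t + shift_ms / 1000.0 for t in pred]
        p, r, f1 = prf(shifted, ref, tol)
        # Assumed tie-break: prefer the smallest absolute shift among equal F1.
        if f1 > best[2] or (f1 == best[2] and abs(shift_ms) < abs(best[3])):
            best = (p, r, f1, shift_ms)
    return best
```

The offset search compensates for a constant timing difference between the audio and the human-authored chart, so a transcription that is consistently 150 ms late is not scored as entirely wrong.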

Usage

The checkpoints are loaded by the STRUM pipeline scripts. Clone the repo and download the checkpoints into checkpoints/, preserving the layout:

git clone https://github.com/opria123/strum
cd strum
python -m venv .venv && source .venv/bin/activate
pip install -e .

# Pull weights from the Hub
huggingface-cli download opria123/strum --local-dir checkpoints/ \
    --local-dir-use-symlinks False

# Run the full pipeline on a folder of audio files
python scripts/batch_pipeline.py /path/to/songs /path/to/charts

The pipeline expects this layout (mirrors the drums/ and guitar/ subfolders here, just under checkpoints/):

checkpoints/
├── drums_v14/best.pt
├── drums_mc_onset/best.pt
├── drums_phase3/best.pt
├── drums_cymbal_onset/best_union_f1.pt
├── tom_refinement_demucs/best.pt
├── onset_classifier/best_f1.pt
├── onset_classifier_v4/best_f1.pt
├── onset_classifier_v6/best_f1.pt
├── onset_classifier_v12_clean/best_f1.pt
├── onset_classifier_v12c_community/best_f1.pt
├── onset_classifier_v15/best_f1.pt
├── onset_classifier_v16/best_f1.pt
├── onset_classifier_v17/best_f1.pt
├── guitar_v2/guitar_v2_onset/best.pt
├── fret_mapper_v4.pt
└── section_classifier/best.pt

A small reorganisation script scripts/sync_from_hf.sh in the source repo handles the drums/ → flat-checkpoints/ mapping.
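Before running the pipeline, it can help to confirm that the flat layout listed above is actually in place. The following is a small sanity-check sketch, not part of the STRUM repo: the file list is copied from the tree above, while the helper name `missing_checkpoints` is hypothetical.

```python
from pathlib import Path

# Relative paths copied from the expected checkpoints/ layout shown above.
EXPECTED = [
    "drums_v14/best.pt",
    "drums_mc_onset/best.pt",
    "drums_phase3/best.pt",
    "drums_cymbal_onset/best_union_f1.pt",
    "tom_refinement_demucs/best.pt",
    "onset_classifier/best_f1.pt",
    "onset_classifier_v4/best_f1.pt",
    "onset_classifier_v6/best_f1.pt",
    "onset_classifier_v12_clean/best_f1.pt",
    "onset_classifier_v12c_community/best_f1.pt",
    "onset_classifier_v15/best_f1.pt",
    "onset_classifier_v16/best_f1.pt",
    "onset_classifier_v17/best_f1.pt",
    "guitar_v2/guitar_v2_onset/best.pt",
    "fret_mapper_v4.pt",
    "section_classifier/best.pt",
]


def missing_checkpoints(root="checkpoints"):
    """Return the expected checkpoint paths that are not files under `root`."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).is_file()]
```

Calling `missing_checkpoints()` from the repo root returns an empty list when every file is present; any paths it returns point at folders that were not remapped or downloads that did not complete.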

License

MIT. See the source repository for full attribution of the underlying training data (Clone Hero / YARG community charters) and dependencies (Demucs v4, librosa, OpenAI Whisper, Spotify Basic Pitch).
