---
license: mit
language:
- en
tags:
- music
- audio
- commodore-64
- sid
- chiptune
- generative
- gpt2
- transformer
---

# SID-GPT 25M

A GPT model trained to generate Commodore 64 SID music by learning from legendary composers.

**[Listen to samples](#audio-samples)** | **[GitHub](https://github.com/M64GitHub/SidGPT)**

## Model Description

SID-GPT learns to predict SID register states frame by frame, essentially learning the "language" of C64 chiptune music. Trained on 2,410 songs from HVSC, it produces output with recognizable musical structures: kick drums, PWM sweeps, basslines, and arpeggios.

| Parameter | Value |
|-----------|-------|
| Parameters | 25.7M |
| Architecture | 8 layers, 8 heads, 512 embedding |
| Block Size | 1020 tokens (20 frames) |
| Effective Context | 12 frames (0.24 sec) |
| Vocabulary | 22 tokens |
| Validation Loss | 0.207 |
| Training Time | 31 hours on M4 MacBook |

## Training Data

- **Source**: [HVSC](https://hvsc.c64.org/) (High Voltage SID Collection)
- **Size**: 1 GB of register-dump sequences (2,410 SID files)
- **Composers**: DRAX (530 songs), Laxity (287), Rob Hubbard (96), Jeroen Tel (176), Martin Galway (40), and 10 others

## Files

| File | Size | Description |
|------|------|-------------|
| `sid-gpt-xxxx.bin` | 98 MB | Exported weights for Zig inference |
| `sid-gpt-xxxx.pt` | 295 MB | PyTorch checkpoint (includes optimizer state) |
| `config.json` | 1 KB | Model configuration |

## Usage

### Zig Inference Engine (Recommended)

The native Zig engine runs at roughly 120-350 tok/s with SIMD and KV caching, depending on the context window:

```bash
# Clone and build the repository
git clone https://github.com/M64GitHub/SidGPT
cd SidGPT
zig build -Doptimize=ReleaseFast

# Download model
wget https://huggingface.co/M64/sid-gpt-25m/resolve/main/sid-gpt-1700.bin -P models/

# Generate and play
./zig-out/bin/sidgpt --model models/sid-gpt-1700.bin --frames 700 --temp 0.90 --seed 7391738265 --context 12 | ./zig-out/bin/sidgpt-play

# Or export to WAV
./zig-out/bin/sidgpt --model models/sid-gpt-1700.bin --frames 700 --temp 0.90 --seed 7391738265 --context 12 --output music.txt
./zig-out/bin/sidgpt-play music.txt --output-wav music.wav
```

### Python Inference

```bash
cd training
python sample_sid.py --checkpoint path/to/sid-gpt-1700.pt --num_frames 700 --temperature 0.95
```

## Generation Tips

**Good seeds to try**: 1337, 7391738264, 7391738265, 4829173650

## Audio Samples

Generated outputs from this model:

| Sample | Seed | Temp | Description |
|--------|------|------|-------------|
| [test.wav](samples/test.wav) | 7391738265 | 0.95 | Melodic arps with bassline and kicks |

## Proof-of-Concept Status

Despite only 12 frames (0.24 sec) of context, the model learned real SID techniques:

- **Kick drums** - pulse-wave frequency sweeps transitioning to noise
- **PWM sweeps** - pulse-width modulation fades (a Rob Hubbard signature)
- **Basslines** - melodic bass patterns with movement
- **Arpeggios** - fast note sequences typical of SID music
- **Leads** - fading-in lead voices

## Limitations

- **Short context**: 12 frames means no long-range song structure
- **Seed dependent**: quality varies significantly with the random seed
- **No conditioning**: cannot specify style/artist (planned for v2)
- **Pattern matching**: the model learns techniques rather than "composing"

## Training Details

```
Loss progression:
Iter    0: 2.88  (random)
Iter  200: 0.96  (structure learned)
Iter  700: 0.37  (musical patterns)
Iter 1000: 0.27  (kick drums, PWM)
Iter 2000: 0.21  (best checkpoint)
```

Training was stopped at iteration 2000, when the validation loss plateaued and the train/val gap exceeded 30% (indicating overfitting).

## Technical Details

### Data Format

Each frame is 25 SID registers encoded as 50 hex characters plus a newline:

```
B0080005410A306011C0064108200016800D41082000B4031F
B0084005410A30601100074108200016C00D41082000B4031F
...
```

- 50 frames = 1 second of audio
- Vocabulary: `0-9`, `A-F`, `<`, `>`, `d`, `e`, `n`, `\n` (22 tokens)

### Inference Optimizations

The Zig engine includes:

- **KV cache**: 50-100x speedup for autoregressive generation
- **SIMD**: `@Vector(8, f32)` operations, 24x speedup
- **Sliding window**: infinite generation beyond the context length

## Citation

```bibtex
@misc{sidgpt2026,
  author = {Mario Schallner},
  title = {SID-GPT: Transformer-based Commodore 64 Music Generation},
  year = {2026},
  publisher = {Hugging Face},
  url = {https://huggingface.co/M64/sid-gpt-25m}
}
```

## Links

- [GitHub Repository](https://github.com/M64GitHub/SidGPT)
- [Training Dataset](https://huggingface.co/datasets/M64/sid-music) - 1 GB of training data (2,410 SID files)
- [HVSC - Training Data Source](https://hvsc.c64.org/)
- [Blog Post](#) *(coming soon)*

## Acknowledgments

Thanks to the legendary C64 composers whose work made this possible: Matt Gray, Jeroen Tel, Rob Hubbard, Martin Galway, DRAX, Laxity, and all contributors to HVSC.
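
### Appendix: Decoding a Frame Line

For reference, a frame line from the register-dump format in Technical Details can be decoded with a few lines of Python. This is a minimal sketch, not part of the repository: it assumes the 25 bytes appear in SID memory-map order ($D400-$D418) and a PAL-clocked machine, neither of which is stated in this card.

```python
# Decode one frame line of the register-dump format: 50 hex characters
# -> 25 raw SID register bytes (one full SID register file per frame).
# Assumptions (hypothetical, not confirmed by this card): registers are
# dumped in SID memory-map order ($D400..$D418); clock is PAL.

PAL_CLOCK = 985_248  # C64 PAL system clock in Hz

def decode_frame(line: str) -> bytes:
    """Parse a 50-hex-char frame line into 25 register bytes."""
    regs = bytes.fromhex(line.strip())
    if len(regs) != 25:
        raise ValueError(f"expected 25 registers, got {len(regs)}")
    return regs

def voice1_freq_hz(regs: bytes) -> float:
    """Voice 1 frequency, assuming regs[0]/regs[1] are $D400/$D401.
    SID datasheet formula: Fout = n * Fclk / 2^24."""
    n = regs[0] | (regs[1] << 8)  # 16-bit frequency value, low byte first
    return n * PAL_CLOCK / 16_777_216

frame = "B0080005410A306011C0064108200016800D41082000B4031F"
regs = decode_frame(frame)
print(len(regs), round(voice1_freq_hz(regs), 1))  # 25 registers, ~130.6 Hz under these assumptions
```

Under these assumptions the example frame's voice 1 sits near 130.6 Hz (close to C3), which is consistent with the bassline-heavy output the card describes, though the actual register ordering of the dump format should be checked against the repository.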