Upload folder using huggingface_hub
- README.md +160 -3
- ckpt_2000.pt +3 -0
- config.json +12 -0
- sid-gpt-25M.bin +3 -0
README.md
CHANGED
|
@@ -1,3 +1,160 @@
|
|
| 1 |
-
---
|
| 2 |
-
license: mit
|
| 3 |
-
|
| 1 |
+
---
|
| 2 |
+
license: mit
|
| 3 |
+
language:
|
| 4 |
+
- en
|
| 5 |
+
tags:
|
| 6 |
+
- music
|
| 7 |
+
- audio
|
| 8 |
+
- commodore-64
|
| 9 |
+
- sid
|
| 10 |
+
- chiptune
|
| 11 |
+
- generative
|
| 12 |
+
- gpt2
|
| 13 |
+
- transformer
|
| 14 |
+
---
|
| 15 |
+
|
| 16 |
+
# SID-GPT 25M
|
| 17 |
+
|
| 18 |
+
A GPT model trained to generate Commodore 64 SID music by learning from legendary composers.
|
| 19 |
+
|
| 20 |
+
**[Listen to samples](#audio-samples)** | **[GitHub](https://github.com/M64GitHub/SidGPT)**
|
| 21 |
+
|
| 22 |
+
## Model Description
|
| 23 |
+
|
| 24 |
+
SID-GPT learns to predict SID register states frame-by-frame, essentially learning the "language" of C64 chiptune music. Trained on 2,410 songs from HVSC, it produces output with recognizable musical structures: kick drums, PWM sweeps, basslines, and arpeggios.
|
| 25 |
+
|
| 26 |
+
| Parameter | Value |
|
| 27 |
+
|-----------|-------|
|
| 28 |
+
| Parameters | 25.7M |
|
| 29 |
+
| Architecture | 8 layers, 8 heads, 512 embedding |
|
| 30 |
+
| Block Size | 1020 tokens (20 frames) |
|
| 31 |
+
| Effective Context | 12 frames (0.24 sec) |
|
| 32 |
+
| Vocabulary | 22 tokens |
|
| 33 |
+
| Validation Loss | 0.207 |
|
| 34 |
+
| Training Time | 31 hours on M4 MacBook |
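As a sanity check on the table above, the parameter count can be reproduced from the architecture alone. The sketch below assumes a nanoGPT-style, bias-free GPT-2 layout (fused QKV projection, 4x MLP expansion, LayerNorm gains only) and counts the output head separately from the token embedding. These are assumptions rather than something this model card states, and `count_params` is a hypothetical helper.

```python
def count_params(n_layer, n_embd, block_size, vocab_size, untied_head=True):
    """Count weights for a bias-free GPT-2-style stack (assumed layout)."""
    # Attention: fused QKV projection plus the output projection.
    attn = n_embd * 3 * n_embd + n_embd * n_embd
    # MLP: up-projection to 4 * n_embd and down-projection back.
    mlp = 2 * (n_embd * 4 * n_embd)
    # Two LayerNorms per block, gain vector only (bias disabled).
    norms = 2 * n_embd
    per_layer = attn + mlp + norms

    # Token embedding table plus learned position embedding table.
    embeddings = vocab_size * n_embd + block_size * n_embd
    final_norm = n_embd
    # Output projection, counted separately from the token embedding (assumption).
    head = vocab_size * n_embd if untied_head else 0
    return n_layer * per_layer + embeddings + final_norm + head


total = count_params(n_layer=8, n_embd=512, block_size=1020, vocab_size=22)
print(f"{total:,} parameters ({total / 1e6:.1f}M)")
```

Under these assumptions the total is 25,719,296, which agrees with the `parameters` field in `config.json` and with the 25.7M figure in the table.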
|
| 35 |
+
|
| 36 |
+
## Training Data
|
| 37 |
+
|
| 38 |
+
- **Source**: [HVSC](https://hvsc.c64.org/) (High Voltage SID Collection)
|
| 39 |
+
- **Size**: 1GB of register dump sequences (2,410 SID files)
|
| 40 |
+
- **Composers**: DRAX (530 songs), Laxity (287), Rob Hubbard (96), Jeroen Tel (176), Martin Galway (40), and 10 others
|
| 41 |
+
|
| 42 |
+
## Files
|
| 43 |
+
|
| 44 |
+
| File | Size | Description |
|
| 45 |
+
|------|------|-------------|
|
| 46 |
+
| `sid-gpt-25M.bin` | 98 MB | Exported weights for Zig inference |
|
| 47 |
+
| `ckpt_2000.pt` | 295 MB | PyTorch checkpoint (includes optimizer state) |
|
| 48 |
+
| `config.json` | 1 KB | Model configuration |
|
| 49 |
+
|
| 50 |
+
## Usage
|
| 51 |
+
|
| 52 |
+
### Zig Inference Engine (Recommended)
|
| 53 |
+
|
| 54 |
+
The native Zig engine runs at roughly 120-350 tok/s with SIMD and KV caching, depending on the context window size:
|
| 55 |
+
```bash
|
| 56 |
+
# Clone repository
|
| 57 |
+
git clone https://github.com/M64GitHub/SidGPT
|
| 58 |
+
cd SidGPT
|
| 59 |
+
zig build -Doptimize=ReleaseFast
|
| 60 |
+
|
| 61 |
+
# Download model
|
| 62 |
+
mkdir -p models && wget https://huggingface.co/M64/sid-gpt-25m/resolve/main/sid-gpt-25M.bin -O models/sid-gpt.bin
|
| 63 |
+
|
| 64 |
+
# Generate and play
|
| 65 |
+
./zig-out/bin/sidgpt --model models/sid-gpt.bin --frames 700 --temp 0.90 --seed 7391738265 --context 12 | ./zig-out/bin/sidgpt-play
|
| 66 |
+
|
| 67 |
+
# Or export to WAV
|
| 68 |
+
./zig-out/bin/sidgpt --model models/sid-gpt.bin --frames 700 --temp 0.90 --seed 7391738265 --context 12 --output music.txt
|
| 69 |
+
./zig-out/bin/sidgpt-play music.txt --output-wav music.wav
|
| 70 |
+
```
|
| 71 |
+
|
| 72 |
+
### Python Inference
|
| 73 |
+
```bash
|
| 74 |
+
cd training
|
| 75 |
+
python sample_sid.py --checkpoint path/to/ckpt_2000.pt --num_frames 700 --temperature 0.95
|
| 76 |
+
```
|
| 77 |
+
|
| 78 |
+
## Generation Tips
|
| 79 |
+
|
| 80 |
+
**Good seeds to try**: 1337, 7391738264, 7391738265, 4829173650
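To illustrate why output quality depends on both the seed and the temperature, here is a minimal sketch of temperature-scaled sampling with a seeded random generator. It assumes the sampler uses a standard softmax over logits divided by the temperature. The actual sampler in the Zig engine or `sample_sid.py` is not shown here and may differ. `sample_token`, `sample_sequence`, and the example logits are hypothetical.

```python
import math
import random


def sample_token(logits, temperature, rng):
    """Draw one token index from softmax(logits / temperature)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def sample_sequence(seed, temperature=0.9, n=8):
    """Sample n tokens from fixed toy logits using a seeded generator."""
    rng = random.Random(seed)
    logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for a 4-token toy vocabulary
    return [sample_token(logits, temperature, rng) for _ in range(n)]


# The same seed reproduces the same sequence; a different seed usually does not.
print(sample_sequence(1337))
print(sample_sequence(1337))
print(sample_sequence(4829173650))
```

Lower temperatures concentrate probability on the highest-scoring token, while higher temperatures flatten the distribution, so a fixed seed pins down one particular path through those choices.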
|
| 81 |
+
|
| 82 |
+
## Audio Samples
|
| 83 |
+
|
| 84 |
+
Generated outputs from this model:
|
| 85 |
+
|
| 86 |
+
| Sample | Seed | Temp | Description |
|
| 87 |
+
|--------|------|------|-------------|
|
| 88 |
+
| [test.wav](samples/test.wav) | 7391738265 | 0.95 | Melodic arps with bassline and kicks |
|
| 89 |
+
|
| 90 |
+
## What It Learned
|
| 91 |
+
|
| 92 |
+
Despite only 12 frames (0.24 sec) of context, the model learned real SID techniques:
|
| 93 |
+
|
| 94 |
+
- **Kick drums** - Pulse wave frequency sweeps transitioning to noise
|
| 95 |
+
- **PWM sweeps** - Pulse width modulation fades (Rob Hubbard signature)
|
| 96 |
+
- **Basslines** - Melodic bass patterns with movement
|
| 97 |
+
- **Arpeggios** - Fast note sequences typical of SID music
|
| 98 |
+
- **Leads** - Fading-in lead voices
|
| 99 |
+
|
| 100 |
+
## Limitations
|
| 101 |
+
|
| 102 |
+
- **Short context**: 12 frames = no long-range song structure
|
| 103 |
+
- **Seed dependent**: Quality varies significantly with random seed
|
| 104 |
+
- **No conditioning**: Cannot specify style/artist (planned for v2)
|
| 105 |
+
- **Pattern matching**: Learns techniques, not "composing"
|
| 106 |
+
|
| 107 |
+
## Training Details
|
| 108 |
+
```
|
| 109 |
+
Loss progression:
|
| 110 |
+
Iter 0: 2.88 (random)
|
| 111 |
+
Iter 200: 0.96 (structure learned)
|
| 112 |
+
Iter 700: 0.37 (musical patterns)
|
| 113 |
+
Iter 1000: 0.27 (kick drums, PWM)
|
| 114 |
+
Iter 2000: 0.21 (best checkpoint)
|
| 115 |
+
```
|
| 116 |
+
|
| 117 |
+
Training was stopped at iter 2000, when the validation loss plateaued and the train/val gap exceeded 30% (indicating overfitting).
|
| 118 |
+
|
| 119 |
+
## Technical Details
|
| 120 |
+
|
| 121 |
+
### Data Format
|
| 122 |
+
|
| 123 |
+
Each frame is 25 SID registers encoded as 50 hex characters + newline:
|
| 124 |
+
```
|
| 125 |
+
B0080005410A306011C0064108200016800D41082000B4031F
|
| 126 |
+
B0084005410A30601100074108200016C00D41082000B4031F
|
| 127 |
+
...
|
| 128 |
+
<end>
|
| 129 |
+
```
|
| 130 |
+
|
| 131 |
+
- 50 frames = 1 second of audio
|
| 132 |
+
- Vocabulary: `0-9`, `A-F`, `<`, `>`, `d`, `e`, `n`, `\n` (22 tokens)
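The format described above can be sketched in a few lines of Python. The snippet builds the 22-symbol vocabulary, decodes one frame line into 25 register bytes, and works through the frame arithmetic (51 tokens per frame, 50 frames per second). The ordering of token ids in `VOCAB` and the helper names are assumptions for illustration, since the tokenizer's actual id assignment is not given here.

```python
# 16 hex digits, the characters of the "<end>" marker, and the newline: 22 symbols.
VOCAB = list("0123456789ABCDEF") + ["<", ">", "d", "e", "n", "\n"]
STOI = {ch: i for i, ch in enumerate(VOCAB)}  # id order is an assumption

TOKENS_PER_FRAME = 50 + 1   # 50 hex characters plus the newline
FRAMES_PER_SECOND = 50


def parse_frame(line):
    """Decode one frame line into a list of 25 register byte values."""
    line = line.rstrip("\n")
    if len(line) != 50:
        raise ValueError(f"expected 50 hex characters, got {len(line)}")
    return list(bytes.fromhex(line))


def encode(text):
    """Map each character to its (assumed) token id."""
    return [STOI[ch] for ch in text]


def frames_to_seconds(n_frames):
    return n_frames / FRAMES_PER_SECOND


frame = "B0080005410A306011C0064108200016800D41082000B4031F"
registers = parse_frame(frame)
print(len(VOCAB), len(registers), registers[:4])
print(len(encode(frame + "\n")))   # tokens in one frame
print(1020 // TOKENS_PER_FRAME)    # frames that fit in the block size
print(frames_to_seconds(12))       # duration of the 12-frame context
```

With these numbers, the 1020-token block holds exactly 20 frames, the 12-frame context spans 0.24 seconds, and a 700-frame generation corresponds to 14 seconds of audio.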
|
| 133 |
+
|
| 134 |
+
### Inference Optimizations
|
| 135 |
+
|
| 136 |
+
The Zig engine includes:
|
| 137 |
+
- **KV Cache**: 50-100x speedup for autoregressive generation
|
| 138 |
+
- **SIMD**: @Vector(8, f32) operations, 24x speedup
|
| 139 |
+
- **Sliding Window**: Infinite generation beyond context length
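The sliding-window idea from the list above can be sketched as follows: the model only ever sees the most recent tokens, so generation can continue past the context length. This is a simplified illustration, not the Zig engine's implementation. It recomputes from the window on every step and ignores KV-cache handling entirely. `generate` and the `next_token` callback are hypothetical names.

```python
from collections import deque

TOKENS_PER_FRAME = 51  # 50 hex characters plus newline


def generate(next_token, prompt, n_new, context_frames=12):
    """Generate n_new tokens while feeding the model a bounded window.

    next_token: callable taking a list of token ids and returning the next id
                (a stand-in for a real model call).
    """
    max_tokens = context_frames * TOKENS_PER_FRAME
    window = deque(prompt, maxlen=max_tokens)  # oldest tokens fall off the left
    out = []
    for _ in range(n_new):
        tok = next_token(list(window))
        out.append(tok)
        window.append(tok)
    return out


# Toy stand-in for a model: records the context size it was given.
seen_lengths = []

def toy_model(context):
    seen_lengths.append(len(context))
    return len(context) % 22


tokens = generate(toy_model, prompt=[0], n_new=2000)
print(len(tokens), max(seen_lengths))
```

The context never grows beyond `context_frames * TOKENS_PER_FRAME` tokens (612 for 12 frames), regardless of how many tokens are generated.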
|
| 140 |
+
|
| 141 |
+
## Citation
|
| 142 |
+
```bibtex
|
| 143 |
+
@misc{sidgpt2025,
|
| 144 |
+
author = {Mario Smokovic},
|
| 145 |
+
title = {SID-GPT: Transformer-based Commodore 64 Music Generation},
|
| 146 |
+
year = {2025},
|
| 147 |
+
publisher = {Hugging Face},
|
| 148 |
+
url = {https://huggingface.co/M64/sid-gpt-25m}
|
| 149 |
+
}
|
| 150 |
+
```
|
| 151 |
+
|
| 152 |
+
## Links
|
| 153 |
+
|
| 154 |
+
- [GitHub Repository](https://github.com/M64GitHub/SidGPT)
|
| 155 |
+
- [HVSC - Training Data Source](https://hvsc.c64.org/)
|
| 156 |
+
- [Blog Post](#) *(coming soon)*
|
| 157 |
+
|
| 158 |
+
## Acknowledgments
|
| 159 |
+
|
| 160 |
+
Thanks to the legendary C64 composers whose work made this possible: Matt Gray, Jeroen Tel, Rob Hubbard, Martin Galway, DRAX, Laxity, and all contributors to HVSC.
|
ckpt_2000.pt
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:1c09bfcb4e8bb6e26c17001b01ad2512fdfc1d0964c1a50419c2b2c50bcfdf82
|
| 3 |
+
size 308551020
|
config.json
ADDED
|
@@ -0,0 +1,12 @@
|
| 1 |
+
{
|
| 2 |
+
"model_type": "gpt2",
|
| 3 |
+
"n_layer": 8,
|
| 4 |
+
"n_head": 8,
|
| 5 |
+
"n_embd": 512,
|
| 6 |
+
"block_size": 1020,
|
| 7 |
+
"vocab_size": 22,
|
| 8 |
+
"bias": false,
|
| 9 |
+
"parameters": 25719296,
|
| 10 |
+
"val_loss": 0.207
|
| 11 |
+
}
|
| 12 |
+
|
sid-gpt-25M.bin
ADDED
|
@@ -0,0 +1,3 @@
|
| 1 |
+
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:d5f1a2a5a9457e5ac97c7e6a11af3df081a170f80b722d0ac468e13af6065f25
|
| 3 |
+
size 102879549
|