---
license: mit
language:
- en
tags:
- music
- audio
- commodore-64
- sid
- chiptune
- generative
- gpt2
- transformer
---

# SID-GPT 25M

A GPT model trained to generate Commodore 64 SID music by learning from legendary composers.

**[Listen to samples](#audio-samples)** | **[GitHub](https://github.com/M64GitHub/SidGPT)**

## Model Description

SID-GPT learns to predict SID register states frame-by-frame, essentially learning the "language" of C64 chiptune music. Trained on 2,410 songs from HVSC, it produces output with recognizable musical structures: kick drums, PWM sweeps, basslines, and arpeggios.

| Parameter | Value |
|-----------|-------|
| Parameters | 25.7M |
| Architecture | 8 layers, 8 heads, 512 embedding |
| Block Size | 1020 tokens (20 frames) |
| Effective Context | 12 frames (0.24 sec) |
| Vocabulary | 22 tokens |
| Validation Loss | 0.207 |
| Training Time | 31 hours on M4 MacBook |
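
The 25.7M figure is consistent with a standard GPT-2-style layout at these dimensions. A rough back-of-the-envelope check, assuming learned positional embeddings and the usual 4x MLP expansion (an estimate, not the confirmed architecture):

```python
# Rough parameter count for a GPT-2-style model with the card's dimensions.
# Assumes learned positional embeddings and a 4x MLP expansion; biases and
# LayerNorm parameters are omitted (they add well under 0.1M here).
n_layer, d = 8, 512
vocab, block = 22, 1020

tok_emb = vocab * d                  # token embedding table
pos_emb = block * d                  # learned positional embeddings
per_layer = 4 * d * d + 8 * d * d    # attention (QKV + out proj) + 4x MLP
total = tok_emb + pos_emb + n_layer * per_layer

print(f"{total / 1e6:.1f}M")  # 25.7M, matching the reported size
```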

## Training Data

- **Source**: [HVSC](https://hvsc.c64.org/) (High Voltage SID Collection)
- **Size**: 1GB of register dump sequences (2,410 SID files)
- **Composers**: DRAX (530 songs), Laxity (287), Rob Hubbard (96), Jeroen Tel (176), Martin Galway (40), and 10 others

## Files

| File | Size | Description |
|------|------|-------------|
| `sid-gpt-xxxx.bin` | 98 MB | Exported weights for Zig inference |
| `sid-gpt-xxxx.pt` | 295 MB | PyTorch checkpoint (includes optimizer state) |
| `config.json` | 1 KB | Model configuration |

## Usage

### Zig Inference Engine (Recommended)

The native Zig engine runs at ~120-350 tok/s with SIMD and KV caching, depending on the context window:
```bash
# Clone repository
git clone https://github.com/M64GitHub/SidGPT
cd SidGPT
zig build -Doptimize=ReleaseFast

# Download model
wget https://huggingface.co/M64/sid-gpt-25m/resolve/main/sid-gpt-1700.bin -P models/

# Generate and play
./zig-out/bin/sidgpt --model models/sid-gpt-1700.bin --frames 700 --temp 0.90 --seed 7391738265 --context 12 | ./zig-out/bin/sidgpt-play

# Or export to WAV
./zig-out/bin/sidgpt --model models/sid-gpt-1700.bin --frames 700 --temp 0.90 --seed 7391738265 --context 12 --output music.txt
./zig-out/bin/sidgpt-play music.txt --output-wav music.wav
```

### Python Inference
```bash
cd training
python sample_sid.py --checkpoint path/to/sid-gpt-1700.pt --num_frames 700 --temperature 0.95
```
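
Both samplers expose a temperature parameter. How it reshapes next-token probabilities can be sketched with generic softmax sampling (illustrative only, not the project's actual sampling code):

```python
import math
import random

def sample(logits, temperature=0.95, rng=random):
    # Divide logits by temperature: values < 1 sharpen the distribution
    # (more predictable output), values > 1 flatten it (more surprises).
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]  # subtract max for stability
    total = sum(exps)
    probs = [e / total for e in exps]
    # Draw one token index from the resulting categorical distribution.
    r, acc = rng.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1
```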

## Generation Tips

**Good seeds to try**: 1337, 7391738264, 7391738265, 4829173650

## Audio Samples

Generated outputs from this model:

| Sample | Seed | Temp | Description |
|--------|------|------|-------------|
| [test.wav](samples/test.wav) | 7391738265 | 0.95 | Melodic arps with bassline and kicks |

## Proof of Concept Status

Despite a context of only 12 frames (0.24 sec), the model learned real SID techniques:

- **Kick drums** - Pulse wave frequency sweeps transitioning to noise
- **PWM sweeps** - Pulse width modulation fades (Rob Hubbard signature)
- **Basslines** - Melodic bass patterns with movement
- **Arpeggios** - Fast note sequences typical of SID music
- **Leads** - Fading-in lead voices

## Limitations

- **Short context**: 12 frames = no long-range song structure
- **Seed dependent**: Quality varies significantly with random seed
- **No conditioning**: Cannot specify style/artist (planned for v2)
- **Pattern matching**: Learns techniques, not "composing"

## Training Details
```
Loss progression:
  Iter 0:     2.88 (random)
  Iter 200:   0.96 (structure learned)
  Iter 700:   0.37 (musical patterns)
  Iter 1000:  0.27 (kick drums, PWM)
  Iter 2000:  0.21 (best checkpoint)
```

Training was stopped at iteration 2000, when the validation loss plateaued and the train/val gap exceeded 30% (indicating overfitting).
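
One plausible reading of that stopping rule, with the gap measured relative to the training loss (the exact formula is not stated in the card):

```python
def should_stop(train_loss, val_loss, val_plateaued, gap_threshold=0.30):
    # Stop when validation loss has plateaued AND the relative train/val
    # gap exceeds the threshold -- the classic overfitting signal.
    gap = (val_loss - train_loss) / train_loss
    return val_plateaued and gap > gap_threshold
```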

## Technical Details

### Data Format

Each frame is 25 SID registers encoded as 50 hex characters + newline:
```
B0080005410A306011C0064108200016800D41082000B4031F
B0084005410A30601100074108200016C00D41082000B4031F
...
<end>
```

- 50 frames = 1 second of audio (one frame per 50 Hz PAL screen refresh)
- Vocabulary: `0-9`, `A-F`, `<`, `>`, `d`, `e`, `n`, `\n` (22 tokens)
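
With only 22 symbols, tokenization reduces to a character map. A minimal sketch of the vocabulary described above (the project's actual token id ordering may differ):

```python
# Character-level vocabulary for SID register dumps (22 tokens).
# The id assignment here is illustrative; the project may order tokens differently.
VOCAB = list("0123456789ABCDEF") + ["<", ">", "d", "e", "n", "\n"]
STOI = {ch: i for i, ch in enumerate(VOCAB)}
ITOS = {i: ch for ch, i in STOI.items()}

def encode(text):
    return [STOI[ch] for ch in text]

def decode(ids):
    return "".join(ITOS[i] for i in ids)

frame = "B0080005410A306011C0064108200016800D41082000B4031F\n"
ids = encode(frame)
assert len(ids) == 51        # 50 hex chars + newline per frame
assert decode(ids) == frame
assert len(VOCAB) == 22
```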

### Inference Optimizations

The Zig engine includes:
- **KV Cache**: 50-100x speedup for autoregressive generation
- **SIMD**: @Vector(8, f32) operations, 24x speedup
- **Sliding Window**: Infinite generation beyond context length
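
The sliding-window idea can be sketched independently of the engine. Here in Python, with a stand-in `predict_next` (hypothetical; the Zig engine's internals differ), using a 612-token window derived from 12 frames x 51 tokens per frame:

```python
import random

def predict_next(context):
    # Hypothetical stand-in for the model's forward pass: returns a token id.
    # The real engine runs a transformer over `context` with a KV cache.
    return random.randrange(22)

def generate(prompt, n_tokens, context_len=612):  # 12 frames * 51 tokens
    out = list(prompt)
    for _ in range(n_tokens):
        window = out[-context_len:]  # keep only the most recent tokens
        out.append(predict_next(window))
    return out

tokens = generate([0] * 51, 5000)
assert len(tokens) == 51 + 5000  # generation runs far past the context length
```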

## Citation
```bibtex
@misc{sidgpt2026,
  author = {Mario Schallner},
  title = {SID-GPT: Transformer-based Commodore 64 Music Generation},
  year = {2026},
  publisher = {Hugging Face},
  url = {https://huggingface.co/M64/sid-gpt-25m}
}
```

## Links

- [GitHub Repository](https://github.com/M64GitHub/SidGPT)
- [Training Dataset](https://huggingface.co/datasets/M64/sid-music): 1 GB of training data (2,410 SID files)
- [HVSC - Training Data Source](https://hvsc.c64.org/)
- [Blog Post](#) *(coming soon)*

## Acknowledgments

Thanks to the legendary C64 composers whose work made this possible: Matt Gray, Jeroen Tel, Rob Hubbard, Martin Galway, DRAX, Laxity, and all contributors to HVSC.