---
license: mit
language:
- en
tags:
- music
- audio
- commodore-64
- sid
- chiptune
- generative
- gpt2
- transformer
---
# SID-GPT 25M
A GPT model trained to generate Commodore 64 SID music by learning from legendary composers.
**[Listen to samples](#audio-samples)** | **[GitHub](https://github.com/M64GitHub/SidGPT)**
## Model Description
SID-GPT learns to predict SID register states frame-by-frame, essentially learning the "language" of C64 chiptune music. Trained on 2,410 songs from HVSC, it produces output with recognizable musical structures: kick drums, PWM sweeps, basslines, and arpeggios.
| Parameter | Value |
|-----------|-------|
| Parameters | 25.7M |
| Architecture | 8 layers, 8 heads, 512 embedding |
| Block Size | 1020 tokens (20 frames) |
| Effective Context | 12 frames (0.24 sec) |
| Vocabulary | 22 tokens |
| Validation Loss | 0.207 |
| Training Time | 31 hours on M4 MacBook |
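The block size follows directly from the frame encoding: each frame is 50 hex characters plus a newline (51 tokens), so 20 frames fill the 1020-token block, and 12 frames at the PAL C64's 50 fps give the 0.24-second effective context. A quick arithmetic check in Python (a sketch; variable names are mine, not the codebase's):

```python
# Each frame: 50 hex characters + 1 newline = 51 tokens
tokens_per_frame = 50 + 1
block_size = 20 * tokens_per_frame   # 20 frames per training block
context_frames = 12                  # effective context at inference time
pal_fps = 50                         # PAL C64 updates the SID at 50 Hz

assert block_size == 1020
print(context_frames / pal_fps)      # 0.24 seconds of audible context
```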
## Training Data
- **Source**: [HVSC](https://hvsc.c64.org/) (High Voltage SID Collection)
- **Size**: 1GB of register dump sequences (2,410 SID files)
- **Composers**: DRAX (530 songs), Laxity (287), Rob Hubbard (96), Jeroen Tel (176), Martin Galway (40), and 10 others
## Files
| File | Size | Description |
|------|------|-------------|
| `sid-gpt-xxxx.bin` | 98 MB | Exported weights for Zig inference |
| `sid-gpt-xxxx.pt` | 295 MB | PyTorch checkpoint (includes optimizer state) |
| `config.json` | 1 KB | Model configuration |
## Usage
### Zig Inference Engine (Recommended)
The native Zig engine runs at ~120-350 tok/s with SIMD and KV caching, depending on the context window:
```bash
# Clone repository
git clone https://github.com/M64GitHub/SidGPT
cd SidGPT
zig build -Doptimize=ReleaseFast
# Download model
wget https://huggingface.co/M64/sid-gpt-25m/resolve/main/sid-gpt-1700.bin -P models/
# Generate and play
./zig-out/bin/sidgpt --model models/sid-gpt-1700.bin --frames 700 --temp 0.90 --seed 7391738265 --context 12 | ./zig-out/bin/sidgpt-play
# Or export to WAV
./zig-out/bin/sidgpt --model models/sid-gpt-1700.bin --frames 700 --temp 0.90 --seed 7391738265 --context 12 --output music.txt
./zig-out/bin/sidgpt-play music.txt --output-wav music.wav
```
### Python Inference
```bash
cd training
python sample_sid.py --checkpoint path/to/sid-gpt-1700.pt --num_frames 700 --temperature 0.95
```
## Generation Tips
**Good seeds to try**: 1337, 7391738264, 7391738265, 4829173650
## Audio Samples
Generated outputs from this model:
| Sample | Seed | Temp | Description |
|--------|------|------|-------------|
| [test.wav](samples/test.wav) | 7391738265 | 0.95 | Melodic arps with bassline and kicks |
## Proof of Concept Status
Despite a context of only 12 frames (0.24 sec), the model learned real SID techniques:
- **Kick drums** - Pulse wave frequency sweeps transitioning to noise
- **PWM sweeps** - Pulse width modulation fades (Rob Hubbard signature)
- **Basslines** - Melodic bass patterns with movement
- **Arpeggios** - Fast note sequences typical of SID music
- **Leads** - Fading-in lead voices
## Limitations
- **Short context**: 12 frames = no long-range song structure
- **Seed dependent**: Quality varies significantly with random seed
- **No conditioning**: Cannot specify style/artist (planned for v2)
- **Pattern matching**: Learns techniques, not "composing"
## Training Details
```
Loss progression:
Iter 0: 2.88 (random)
Iter 200: 0.96 (structure learned)
Iter 700: 0.37 (musical patterns)
Iter 1000: 0.27 (kick drums, PWM)
Iter 2000: 0.21 (best checkpoint)
```
Training was stopped at iter 2000 when validation loss plateaued and train/val gap exceeded 30% (indicating overfitting).
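The stopping rule above can be expressed as a small check. This is a sketch under my own assumptions: the gap is taken relative to validation loss, and the plateau detection is left to the caller.

```python
def should_stop(train_loss: float, val_loss: float,
                plateaued: bool, gap_limit: float = 0.30) -> bool:
    """Stop when validation loss has plateaued AND the train/val gap exceeds 30%."""
    gap = (val_loss - train_loss) / val_loss   # relative overfitting gap (assumption)
    return plateaued and gap > gap_limit

# Around iter 2000: val loss 0.21, train loss ~0.14 -> gap ~33%, stop
print(should_stop(train_loss=0.14, val_loss=0.21, plateaued=True))
```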
## Technical Details
### Data Format
Each frame is 25 SID registers encoded as 50 hex characters + newline:
```
B0080005410A306011C0064108200016800D41082000B4031F
B0084005410A30601100074108200016C00D41082000B4031F
...
<end>
```
- 50 frames = 1 second of audio
- Vocabulary: `0-9`, `A-F`, `<`, `>`, `d`, `e`, `n`, `\n` (22 tokens)
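As an illustration, a minimal Python sketch (the function name is hypothetical, not part of the toolchain) that decodes one dump line back into the 25 raw SID register bytes:

```python
def decode_frame(line: str) -> list[int]:
    """Decode one 50-hex-character dump line into 25 SID register bytes."""
    line = line.strip()
    assert len(line) == 50, "expected 25 registers = 50 hex characters"
    return [int(line[i:i + 2], 16) for i in range(0, 50, 2)]

regs = decode_frame("B0080005410A306011C0064108200016800D41082000B4031F")
print(len(regs))     # 25 register values
print(hex(regs[0]))  # 0xb0, first register byte of the frame
```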
### Inference Optimizations
The Zig engine includes:
- **KV Cache**: 50-100x speedup for autoregressive generation
- **SIMD**: `@Vector(8, f32)` operations, 24x speedup
- **Sliding Window**: Infinite generation beyond context length
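The sliding-window idea can be sketched in a few lines of Python. This is not the engine's code: `model_step` stands in for a single forward pass, and the default window of 612 tokens assumes 12 frames times 51 tokens per frame.

```python
def generate(model_step, prompt: list[int], n_tokens: int,
             context: int = 12 * 51) -> list[int]:
    """Autoregressive generation with a sliding context window.

    model_step: any callable mapping a token window to the next token id.
    The window is clipped to `context`, so generation can run indefinitely.
    """
    tokens = list(prompt)
    for _ in range(n_tokens):
        window = tokens[-context:]          # keep only the most recent tokens
        tokens.append(model_step(window))   # model never sees beyond the window
    return tokens

# Toy "model" that always emits token 0, just to show the call shape
out = generate(lambda w: 0, prompt=[1, 2, 3], n_tokens=5)
print(len(out))  # 8 tokens: 3 prompt + 5 generated
```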
## Citation
```bibtex
@misc{sidgpt2026,
  author    = {Mario Schallner},
  title     = {SID-GPT: Transformer-based Commodore 64 Music Generation},
  year      = {2026},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/M64/sid-gpt-25m}
}
```
## Links
- [GitHub Repository](https://github.com/M64GitHub/SidGPT)
- [Training Dataset](https://huggingface.co/datasets/M64/sid-music): 1 GB of register dump sequences (2,410 SID files)
- [HVSC - Training Data Source](https://hvsc.c64.org/)
- [Blog Post](#) *(coming soon)*
## Acknowledgments
Thanks to the legendary C64 composers whose work made this possible: Matt Gray, Jeroen Tel, Rob Hubbard, Martin Galway, DRAX, Laxity, and all contributors to HVSC.