Upload folder using huggingface_hub

Files changed (4)

README.md +160 -3
ckpt_2000.pt +3 -0
config.json +12 -0
sid-gpt-25M.bin +3 -0

README.md CHANGED

@@ -1,3 +1,160 @@
----
-license: mit
----

+---
+license: mit
+language:
+- en
+tags:
+- music
+- audio
+- commodore-64
+- sid
+- chiptune
+- generative
+- gpt2
+- transformer
+---
+# SID-GPT 25M
+A GPT model trained to generate Commodore 64 SID music by learning from legendary composers.
+**[Listen to samples](#audio-samples)** | **[GitHub](https://github.com/M64GitHub/SidGPT)**
+## Model Description
+SID-GPT learns to predict SID register states frame-by-frame, essentially learning the "language" of C64 chiptune music. Trained on 2,410 songs from HVSC, it produces output with recognizable musical structures: kick drums, PWM sweeps, basslines, and arpeggios.
+| Parameter | Value |
+|-----------|-------|
+| Parameters | 25.7M |
+| Architecture | 8 layers, 8 heads, 512 embedding |
+| Block Size | 1020 tokens (20 frames) |
+| Effective Context | 12 frames (0.24 sec) |
+| Vocabulary | 22 tokens |
+| Validation Loss | 0.207 |
+| Training Time | 31 hours on M4 MacBook |
+## Training Data
+- **Source**: [HVSC](https://hvsc.c64.org/) (High Voltage SID Collection)
+- **Size**: 1GB of register dump sequences (2,410 SID files)
+- **Composers**: DRAX (530 songs), Laxity (287), Rob Hubbard (96), Jeroen Tel (176), Martin Galway (40), and 10 others
+## Files
+| File | Size | Description |
+|------|------|-------------|
+| `sid-gpt-25M.bin` | 98 MB | Exported weights for Zig inference |
+| `ckpt_2000.pt` | 295 MB | PyTorch checkpoint (includes optimizer state) |
+| `config.json` | 1 KB | Model configuration |
+## Usage
+### Zig Inference Engine (Recommended)
+The native Zig engine runs at roughly 120-350 tok/s with SIMD and KV caching, depending on the context window size:
+```bash
+# Clone repository
+git clone https://github.com/M64GitHub/SidGPT
+cd SidGPT
+zig build -Doptimize=ReleaseFast
+# Download model (file name as uploaded in this commit)
+wget https://huggingface.co/M64/sid-gpt-25m/resolve/main/sid-gpt-25M.bin -P models/
+# Generate and play
+./zig-out/bin/sidgpt --model models/sid-gpt-25M.bin --frames 700 --temp 0.90 --seed 7391738265 --context 12 | ./zig-out/bin/sidgpt-play
+# Or export to WAV
+./zig-out/bin/sidgpt --model models/sid-gpt-25M.bin --frames 700 --temp 0.90 --seed 7391738265 --context 12 --output music.txt
+./zig-out/bin/sidgpt-play music.txt --output-wav music.wav
+```
+### Python Inference
+```bash
+cd training
+python sample_sid.py --checkpoint path/to/ckpt_2000.pt --num_frames 700 --temperature 0.95
+```
+## Generation Tips
+**Good seeds to try**: 1337, 7391738264, 7391738265, 4829173650
+## Audio Samples
+Generated outputs from this model:
+| Sample | Seed | Temp | Description |
+|--------|------|------|-------------|
+| [test.wav](samples/test.wav) | 7391738265 | 0.95 | Melodic arps with bassline and kicks |
+## What It Learned
+Despite only 12 frames (0.24 sec) of context, the model learned real SID techniques:
+- **Kick drums** - Pulse wave frequency sweeps transitioning to noise
+- **PWM sweeps** - Pulse width modulation fades (Rob Hubbard signature)
+- **Basslines** - Melodic bass patterns with movement
+- **Arpeggios** - Fast note sequences typical of SID music
+- **Leads** - Fading-in lead voices
+## Limitations
+- **Short context**: 12 frames = no long-range song structure
+- **Seed dependent**: Quality varies significantly with random seed
+- **No conditioning**: Cannot specify style/artist (planned for v2)
+- **Pattern matching**: Learns techniques, not "composing"
+## Training Details
+```
+Loss progression:
+  Iter 0:     2.88 (random)
+  Iter 200:   0.96 (structure learned)
+  Iter 700:   0.37 (musical patterns)
+  Iter 1000:  0.27 (kick drums, PWM)
+  Iter 2000:  0.21 (best checkpoint)
+```
+Training was stopped at iteration 2000, when validation loss had plateaued and the train/val gap exceeded 30%, which indicates overfitting.
+## Technical Details
+### Data Format
+Each frame is 25 SID registers encoded as 50 hex characters + newline:
+```
+B0080005410A306011C0064108200016800D41082000B4031F
+B0084005410A30601100074108200016C00D41082000B4031F
+...
+<end>
+```
+- 50 frames = 1 second of audio
+- Vocabulary: `0-9`, `A-F`, `<`, `>`, `d`, `e`, `n`, `\n` (22 tokens)
+### Inference Optimizations
+The Zig engine includes:
+- **KV Cache**: 50-100x speedup for autoregressive generation
+- **SIMD**: @Vector(8, f32) operations, 24x speedup
+- **Sliding Window**: Infinite generation beyond context length
+## Citation
+```bibtex
+@misc{sidgpt2025,
+  author = {Mario Smokovic},
+  title = {SID-GPT: Transformer-based Commodore 64 Music Generation},
+  year = {2025},
+  publisher = {Hugging Face},
+  url = {https://huggingface.co/M64/sid-gpt-25m}
+}
+```
+## Links
+- [GitHub Repository](https://github.com/M64GitHub/SidGPT)
+- [HVSC - Training Data Source](https://hvsc.c64.org/)
+- [Blog Post](#) *(coming soon)*
+## Acknowledgments
+Thanks to the legendary C64 composers whose work made this possible: Matt Gray, Jeroen Tel, Rob Hubbard, Martin Galway, DRAX, Laxity, and all contributors to HVSC.
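
The data format and context figures stated in the README above can be checked with a short Python sketch. It assumes character-level tokenization over the 22 listed symbols; the token ids (`STOI`) and the helper names `decode_frame` and `encode` are hypothetical, since the README does not give the real vocabulary order or tokenizer code.

```python
# Sample frame copied from the README's "Data Format" section.
HEX_FRAME = "B0080005410A306011C0064108200016800D41082000B4031F"


def decode_frame(line: str) -> list[int]:
    """Decode one dump line (50 hex characters) into 25 register bytes."""
    line = line.strip()
    if len(line) != 50:
        raise ValueError(f"expected 50 hex characters, got {len(line)}")
    return list(bytes.fromhex(line))


# The 22 symbols listed in the README. The id assignment is an assumption.
VOCAB = list("0123456789ABCDEF") + ["<", ">", "d", "e", "n", "\n"]
STOI = {ch: i for i, ch in enumerate(VOCAB)}


def encode(text: str) -> list[int]:
    """Map each character to its (hypothetical) token id."""
    return [STOI[ch] for ch in text]


registers = decode_frame(HEX_FRAME)
tokens = encode(HEX_FRAME + "\n")

TOKENS_PER_FRAME = len(tokens)               # 50 hex characters + newline
FRAMES_PER_BLOCK = 1020 // TOKENS_PER_FRAME  # block size from the README
CONTEXT_SECONDS = 12 / 50                    # 12 frames at 50 frames per second

print(len(registers), TOKENS_PER_FRAME, FRAMES_PER_BLOCK, CONTEXT_SECONDS)
```

Under these assumptions one frame costs 51 tokens, which is consistent with the stated block size of 1020 tokens covering 20 frames and with 12 frames lasting 0.24 seconds.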

ckpt_2000.pt ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:1c09bfcb4e8bb6e26c17001b01ad2512fdfc1d0964c1a50419c2b2c50bcfdf82
+size 308551020

config.json ADDED

	@@ -0,0 +1,12 @@

+{
+  "model_type": "gpt2",
+  "n_layer": 8,
+  "n_head": 8,
+  "n_embd": 512,
+  "block_size": 1020,
+  "vocab_size": 22,
+  "bias": false,
+  "parameters": 25719296,
+  "val_loss": 0.207
+}
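
The `parameters` value in this config can be cross-checked against the other fields. The Python sketch below assumes a GPT-2 style layout that the commit does not spell out: learned token and position embeddings, a 4x MLP expansion, LayerNorm weights without bias (matching `"bias": false`), and an output head counted separately from the token embedding. The function name `gpt_param_count` is hypothetical.

```python
def gpt_param_count(n_layer: int, n_embd: int, block_size: int,
                    vocab_size: int, tied_head: bool) -> int:
    """Count weights of a bias-free GPT-2 style decoder (assumed layout)."""
    wte = vocab_size * n_embd                       # token embedding
    wpe = block_size * n_embd                       # position embedding
    attn = n_embd * 3 * n_embd + n_embd * n_embd    # QKV projection + output projection
    mlp = 2 * (n_embd * 4 * n_embd)                 # up and down projections
    norms = 2 * n_embd                              # two LayerNorm weight vectors per block
    final_norm = n_embd
    head = 0 if tied_head else vocab_size * n_embd  # output projection
    return wte + wpe + n_layer * (attn + mlp + norms) + final_norm + head


untied = gpt_param_count(8, 512, 1020, 22, tied_head=False)
tied = gpt_param_count(8, 512, 1020, 22, tied_head=True)
print(untied, tied)
```

With the head counted separately, the sketch reproduces the 25,719,296 figure from the config; with a tied head it gives 11,264 fewer (22 x 512). This shows only that the stated count is consistent with the assumed layout, not how the actual model stores its weights.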

sid-gpt-25M.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:d5f1a2a5a9457e5ac97c7e6a11af3df081a170f80b722d0ac468e13af6065f25
+size 102879549
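
The two large files are stored as Git LFS pointers, so the diff shows only a hash and a byte count. As a minimal sketch, the pointer text can be parsed in Python to compare the byte count with the sizes listed in the README; `parse_lfs_pointer` is a hypothetical helper, not part of any library.

```python
def parse_lfs_pointer(text: str) -> dict[str, str]:
    """Split each 'key value' line of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


# Pointer contents copied from the sid-gpt-25M.bin hunk above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:d5f1a2a5a9457e5ac97c7e6a11af3df081a170f80b722d0ac468e13af6065f25
size 102879549
"""

info = parse_lfs_pointer(POINTER)
size_bytes = int(info["size"])
size_mib = size_bytes / 2**20

print(f"{size_mib:.1f} MiB")  # prints "98.1 MiB"
```

The result is in line with the 98 MB listed for the exported weights in the README's file table.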