---
license: mit
library_name: onnx
pipeline_tag: other
tags:
- flappy-bird
- world-model
- onnx
- webgpu
- game
---
# 🐦 Dreaming Bird: a neural Flappy Bird world model
A small decoder-only transformer trained to **be** Flappy Bird. Given the recent frames and the
player's action (flap / no-flap), it emits the next frame's quantized state. Looped
autoregressively at ~30–60 fps, the model **replaces the physics engine**: there is no game
logic at runtime; the network *is* the physics.
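
The autoregressive loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's code: `dream`, `toy_step`, `Frame`, and `StepFn` are hypothetical names, and `toy_step` is a hand-written stand-in for the network so that the loop runs without `model.onnx`. The default 256-frame context is taken from the `small_pipes.pt` entry in the Files table.

```python
from collections import deque
from typing import Callable, List, Sequence, Tuple

# One frame = four quantized fields: (bird_y, pipe_dx, gap_y, status).
Frame = Tuple[int, int, int, int]
StepFn = Callable[[List[Frame], int], Frame]


def dream(step_fn: StepFn, seed: Sequence[Frame], actions: Sequence[int],
          ctx_frames: int = 256) -> List[Frame]:
    """Roll the world model forward, feeding each output back in as input."""
    history = deque(seed, maxlen=ctx_frames)  # oldest frames fall off the left
    rollout: List[Frame] = []
    for action in actions:  # 1 = flap, 0 = no-flap
        frame = step_fn(list(history), action)
        history.append(frame)
        rollout.append(frame)
    return rollout


def toy_step(history: List[Frame], action: int) -> Frame:
    """Hypothetical stand-in for the network: NOT the real model or physics."""
    bird_y, pipe_dx, gap_y, status = history[-1]
    bird_y = max(0, min(127, bird_y + (-3 if action else 1)))
    return (bird_y, pipe_dx, gap_y, status)


# Two idle frames followed by one flap.
frames = dream(toy_step, seed=[(64, 0, 0, 0)], actions=[0, 0, 1])
```

In the real demo the `step_fn` role would be played by an inference call on the model; nothing else in the loop needs to know about gravity or pipes.
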
🎮 **Play it in your browser:** https://gmmeyer.github.io/gpt-bird/ (the page fetches
`model.onnx` from this repo and runs it client-side via onnxruntime-web + WebGPU)
💻 **Source:** https://github.com/gmmeyer/gptbird
## Files
| File | What |
|---|---|
| `model.onnx` | The ~11M-param "small" model (full game with pipes). ONNX opset 17, dynamic sequence length, returns last-position logits. This is what the web demo runs. |
| `config.json` | Tokenizer offsets, quantization bins, and engine geometry: everything the JS decode loop and renderer need. |
| `small_pipes.pt` | PyTorch checkpoint for `model.onnx` (n_layer 6, d_model 384, ctx 256 frames). |
| `nano_nopipes.pt` | PyTorch checkpoint for the ~1.9M-param Phase-2 model (bird + gravity + flap, no pipes). |
## How it works
Each frame is **4 tokens** (`bird_y, pipe_dx, gap_y, status`) generated with **slot-constrained
decoding**: logits are masked to the legal id range for each field, so a malformed frame is
impossible by construction. The `gap_y` slot is **sampled** (a newly revealed pipe's gap is
genuinely unpredictable, so the dream invents one); the other slots are greedy.
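
A minimal sketch of slot-constrained decoding follows. The id ranges in `SLOT_RANGES` and the vocabulary size are assumptions made up for illustration (only the 128 `bird_y` bins come from this card; the real offsets live in `config.json`), and `decode_slot` is a hypothetical helper. Restricting the choice to the slot's slice is equivalent to masking every other logit to negative infinity.

```python
import math
import random
from typing import Dict, List, Optional, Tuple

# ASSUMED vocabulary layout, for illustration only; see config.json for the real one.
SLOT_RANGES: Dict[str, Tuple[int, int]] = {
    "bird_y": (0, 128),     # 128 bins, as stated in this card
    "pipe_dx": (128, 192),  # hypothetical
    "gap_y": (192, 224),    # hypothetical
    "status": (224, 228),   # hypothetical
}
VOCAB_SIZE = 228


def decode_slot(logits: List[float], slot: str, sample: bool = False,
                rng: Optional[random.Random] = None) -> int:
    """Pick a token id for `slot`, considering only that slot's legal ids."""
    lo, hi = SLOT_RANGES[slot]
    legal = logits[lo:hi]  # ids outside [lo, hi) can never be chosen
    if not sample:
        # Greedy: argmax over the legal ids.
        return lo + max(range(len(legal)), key=legal.__getitem__)
    # Sampling: softmax restricted to the legal ids (max-subtracted for stability).
    peak = max(legal)
    weights = [math.exp(x - peak) for x in legal]
    rng = rng or random.Random()
    return lo + rng.choices(range(len(legal)), weights=weights, k=1)[0]


# Even if an out-of-range id has the largest logit, the decoded id stays legal.
logits = [0.0] * VOCAB_SIZE
logits[5] = 100.0   # a bird_y id with a huge logit
logits[200] = 1.0   # the best gap_y id
greedy_gap = decode_slot(logits, "gap_y")
sampled_gap = decode_slot(logits, "gap_y", sample=True, rng=random.Random(0))
```

Under this scheme only `gap_y` would be decoded with `sample=True`; the other three slots use the greedy path.
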
**Velocity is never in the observed state**: if the bird flies well, the model must have
reconstructed velocity from the *sequence* of positions (finite differences), which is the
central experiment. Quantization (bird_y → 128 bins) doubles as a drift stabilizer.
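
To make the finite-difference idea concrete, the sketch below quantizes a falling trajectory into 128 bins and recovers velocity and acceleration from positions alone. The helper names, the `[0, 128)` coordinate range, and the trajectory values are assumptions chosen for illustration; they are not the engine's actual geometry or data.

```python
from typing import List


def quantize(y: float, y_min: float, y_max: float, n_bins: int = 128) -> int:
    """Map a continuous height to a bin index in [0, n_bins - 1]."""
    frac = (y - y_min) / (y_max - y_min)
    return min(n_bins - 1, max(0, int(frac * n_bins)))


def finite_diff(values: List[int]) -> List[int]:
    """First differences: values[t + 1] - values[t]."""
    return [b - a for a, b in zip(values, values[1:])]


# Toy free fall, y = 2 * t^2, in an ASSUMED coordinate range of [0, 128).
heights = [2.0 * t * t for t in range(5)]
bins = [quantize(y, 0.0, 128.0) for y in heights]

velocity = finite_diff(bins)          # bins per frame, from positions alone
acceleration = finite_diff(velocity)  # constant for constant gravity
```

Here the second difference is constant, which is the signature of constant gravity: a model that sees only the last few `bird_y` tokens has enough information to infer both quantities.
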
## Results
- One-step `bird_y`: **98.6% exact / 100% within ±1 bin** (held-out).
- `gap_y`: 99.96% exact on stable frames; **100% validity** on RNG spawn frames.
- Collisions: **94.9% within ±1 frame** of the true engine.
- Free-rollout drift: `bird_y` error never exceeded 3 bins over 16 runs (mean 0.49); a controller
playing *inside the dream* threads **5+ pipes**.
- Cacheless dreaming runs at **~125 fps** on an RTX 5090.
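
For readers who want to reproduce metrics of the "exact / within ±1 bin" kind, one way to compute them is sketched below. The function name is hypothetical and the sample values are toy numbers, not data from this project's evaluation.

```python
from typing import Sequence, Tuple


def exact_and_within(pred: Sequence[int], true: Sequence[int],
                     tol: int = 1) -> Tuple[float, float]:
    """Return (fraction exactly equal, fraction within +/- tol bins)."""
    if len(pred) != len(true) or not true:
        raise ValueError("pred and true must be non-empty and of equal length")
    n = len(true)
    exact = sum(p == t for p, t in zip(pred, true)) / n
    within = sum(abs(p - t) <= tol for p, t in zip(pred, true)) / n
    return exact, within


# Toy values only: two exact hits, one off-by-one, one off-by-three.
exact, within = exact_and_within([10, 11, 12, 20], [10, 11, 13, 23])
```
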
## License
MIT.