---
license: mit
library_name: onnx
pipeline_tag: other
tags:
- flappy-bird
- world-model
- onnx
- webgpu
- game
---

# 🐦 Dreaming Bird: a neural Flappy Bird world model

A small decoder-only transformer trained to **be** Flappy Bird. Given the recent frames and the player's action (flap / no-flap), it emits the next frame's quantized state. Looped autoregressively at ~30–60 fps, the model **replaces the physics engine**: there is no game logic at runtime; the network *is* the physics.

🎮 **Play it in your browser:** https://gmmeyer.github.io/gpt-bird/
(the page fetches `model.onnx` from this repo and runs it client-side via onnxruntime-web + WebGPU)

💻 **Source:** https://github.com/gmmeyer/gptbird

## Files

| File | What |
|---|---|
| `model.onnx` | The ~11M-param "small" model (full game with pipes). ONNX opset 17, dynamic sequence length, returns last-position logits. This is what the web demo runs. |
| `config.json` | Tokenizer offsets, quantization bins, and engine geometry: everything the JS decode loop and renderer need. |
| `small_pipes.pt` | PyTorch checkpoint for `model.onnx` (n_layer 6, d_model 384, ctx 256 frames). |
| `nano_nopipes.pt` | PyTorch checkpoint for the ~1.9M-param Phase-2 model (bird + gravity + flap, no pipes). |

## How it works

Each frame is **4 tokens** (`bird_y, pipe_dx, gap_y, status`) generated with **slot-constrained decoding**: logits are masked to the legal id range for each field, so a malformed frame is impossible by construction. The `gap_y` slot is **sampled** (a newly revealed pipe's gap is genuinely unpredictable, so the dream invents one); the other slots are greedy.

**Velocity is never in the observed state.** If the bird flies well, the model must have reconstructed velocity from the *sequence* of positions (finite differences), which is the central experiment. Quantization (`bird_y` → 128 bins) doubles as a drift stabilizer.

## Results

- One-step `bird_y`: **98.6% exact / 100% within ±1 bin** (held-out).
- `gap_y`: 99.96% exact on stable frames; **100% validity** on RNG spawn frames.
- Collisions: **94.9% within ±1 frame** of the true engine.
- Free-rollout drift: `bird_y` never exceeded 3 bins over 16 runs (mean 0.49); a controller playing *inside the dream* threads **5+ pipes**.
- Cacheless dreaming runs at **~125 fps** on an RTX 5090.

## License

MIT.
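
## Decoding sketch

The slot-constrained decoding described under "How it works" can be illustrated with a short, dependency-free Python sketch. This is not the repo's decode loop. The id ranges in `SLOT_RANGES`, the `VOCAB_SIZE`, and the `next_logits` callback are hypothetical stand-ins (the real offsets live in `config.json`); only the four slot names, the 128 `bird_y` bins, and the rule that `gap_y` is sampled while the other slots are greedy come from the text above. How the player's action enters the context is left out.

```python
import math
import random

# Hypothetical vocabulary layout: each slot owns a contiguous id range.
# Only the 128 bird_y bins come from the README; the other sizes are invented.
SLOT_RANGES = {
    "bird_y": (0, 128),
    "pipe_dx": (128, 192),
    "gap_y": (192, 224),
    "status": (224, 228),
}
SLOT_ORDER = ["bird_y", "pipe_dx", "gap_y", "status"]
SAMPLED_SLOTS = {"gap_y"}  # every other slot is decoded greedily
VOCAB_SIZE = 228


def pick_token(logits, slot, rng):
    """Choose a token id for `slot`, considering only its legal id range.

    Restricting the candidates to [lo, hi) has the same effect as setting
    every other logit to -inf before argmax or softmax.
    """
    lo, hi = SLOT_RANGES[slot]
    legal_ids = range(lo, hi)
    if slot not in SAMPLED_SLOTS:
        # Greedy: highest logit among the legal ids.
        return max(legal_ids, key=lambda i: logits[i])
    # Sampled: softmax over the legal ids only (shifted by the max for stability).
    peak = max(logits[i] for i in legal_ids)
    weights = [math.exp(logits[i] - peak) for i in legal_ids]
    return rng.choices(legal_ids, weights=weights, k=1)[0]


def decode_frame(next_logits, context, rng):
    """Decode one 4-token frame, one slot at a time.

    `next_logits(tokens)` is a hypothetical callback that returns a list of
    VOCAB_SIZE last-position logits for the given token sequence.
    """
    frame = []
    for slot in SLOT_ORDER:
        logits = next_logits(context + frame)
        frame.append(pick_token(logits, slot, rng))
    return frame


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-in for the real model: random logits over the whole vocabulary.
    noise = lambda tokens: [rng.uniform(-5.0, 5.0) for _ in range(VOCAB_SIZE)]
    print(decode_frame(noise, [], rng))
```

Even with random logits, every decoded frame has exactly one id from each slot's range, which is the "impossible by construction" property: the constraint lives in the decoder, not in what the network happens to predict.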