---
license: apache-2.0
tags:
- game-ai
- flow-matching
- action-prediction
- elden-ring
- vla
base_model: Qwen/Qwen3.5-4B
---

# Pi-Lumine 4B: Flow-Matching Action Decoder for Elden Ring

A Pi0.5-style flow-matching action decoder trained on top of a frozen Qwen3.5-4B VLM backbone.

## Architecture

- **Base VLM**: Qwen/Qwen3.5-4B (frozen, not included; downloaded at runtime)
- **Action Decoder**: FiLM-conditioned transformer with cross-attention to VLM hidden states
  - 2 decoder layers, VLM dim 2560 → decoder dim 1024, 8 attention heads
  - Projection layers decouple decoder from VLM hidden size
  - Instruction-conditioned via AdaptiveRMSNorm (FiLM)
  - Sinusoidal time embedding for flow matching
  - ~64M trainable parameters
- **Action Space**: 6 steps x 20 dims (4 sticks + 16 buttons per step)
- **Training**: Flow matching with Euler ODE integration at inference

## Files

- `action_decoder.pt`: Trained action decoder weights
- `decoder_config.json`: Architecture and tokenizer config
- `tokenizer.json` / `tokenizer_config.json`: Tokenizer with special tokens
- `chat_template.jinja`: Chat template
- `processor_config.json`: Processor config
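
## Sampling sketch

The Architecture section mentions a sinusoidal time embedding and Euler ODE integration at inference. The following is a minimal, dependency-free sketch of what those two pieces might look like for a 6 x 20 action chunk. It is not the code shipped with this model. The function names, the `max_period` value, the 10-step schedule, and the convention that t=0 is noise and t=1 is the action chunk are all assumptions, and `toy_velocity` is a hypothetical stand-in for the trained decoder (which would also condition on VLM hidden states and the instruction).

```python
import math
import random

HORIZON, ACTION_DIM = 6, 20  # 6 steps x 20 dims (4 stick dims + 16 button dims)


def sinusoidal_time_embedding(t, dim=1024, max_period=10000.0):
    """Embed a scalar flow time t as [sin | cos] features (assumed layout)."""
    half = dim // 2
    freqs = [math.exp(-math.log(max_period) * i / half) for i in range(half)]
    return [math.sin(t * f) for f in freqs] + [math.cos(t * f) for f in freqs]


def euler_sample(velocity_fn, num_steps=10, seed=0):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 (noise) to t=1 with Euler steps."""
    rng = random.Random(seed)
    # Start from Gaussian noise with the same shape as an action chunk.
    x = [[rng.gauss(0.0, 1.0) for _ in range(ACTION_DIM)] for _ in range(HORIZON)]
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        v = velocity_fn(x, t)
        x = [[xi + dt * vi for xi, vi in zip(xr, vr)] for xr, vr in zip(x, v)]
    return x


# Hypothetical target chunk: sticks held at 0.5, all buttons released.
TARGET = [[0.5] * 4 + [0.0] * 16 for _ in range(HORIZON)]


def toy_velocity(x, t):
    """Stand-in for the decoder: velocity of the straight path from x toward TARGET."""
    return [[(a - xi) / (1.0 - t) for xi, a in zip(xr, ar)]
            for xr, ar in zip(x, TARGET)]


actions = euler_sample(toy_velocity)
```

With this toy velocity field the Euler steps land on `TARGET` at t=1, which makes the loop easy to check. A real decoder would replace `toy_velocity` with a forward pass that takes the noisy actions, the time embedding, and the VLM conditioning.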