| license: apache-2.0 | |
| tags: | |
| - game-ai | |
| - flow-matching | |
| - action-prediction | |
| - elden-ring | |
| - vla | |
| base_model: Qwen/Qwen3.5-4B | |
| # Pi-Lumine 4B: Flow-Matching Action Decoder for Elden Ring | |
| A Pi0.5-style flow-matching action decoder trained on top of a frozen Qwen3.5-4B VLM backbone. | |
| ## Architecture | |
| - **Base VLM**: Qwen/Qwen3.5-4B (frozen, not included; downloaded at runtime) | |
| - **Action Decoder**: FiLM-conditioned transformer with cross-attention to VLM hidden states | |
| - 2 decoder layers, VLM dim 2560 → decoder dim 1024, 8 attention heads | |
| - Projection layers decouple decoder from VLM hidden size | |
| - Instruction-conditioned via AdaptiveRMSNorm (FiLM) | |
| - Sinusoidal time embedding for flow matching | |
| - ~64M trainable parameters | |
| - **Action Space**: 6 steps x 20 dims (4 sticks + 16 buttons per step) | |
| - **Training**: Flow matching with Euler ODE integration at inference | |
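As a rough illustration of the Euler ODE integration mentioned above, the sketch below integrates a velocity field from noise to a 6 x 20 action chunk. It is not the model's inference code. The trained decoder is replaced by a toy velocity field (`make_toy_velocity`, hypothetical) that points along a straight path to a known target. The time direction (t=0 noise, t=1 actions), the 10-step schedule, and the column layout (sticks first, buttons last) are assumptions made for the example.

```python
import numpy as np

HORIZON, ACTION_DIM = 6, 20  # 6 steps x 20 dims, as listed in the action space


def euler_sample(velocity_fn, noise, num_steps=10):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with fixed-step Euler.

    num_steps=10 is an assumed value, not taken from the model card.
    """
    x = noise.copy()
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t takes values 0, dt, ..., 1 - dt (never reaches 1)
        x = x + dt * velocity_fn(x, t)
    return x


def make_toy_velocity(target):
    """Hypothetical stand-in for the trained decoder.

    Returns the velocity of the straight-line path from the current point
    to `target`, so Euler integration lands on `target` at t=1.
    """
    def velocity(x, t):
        return (target - x) / (1.0 - t)
    return velocity


rng = np.random.default_rng(0)
target = rng.uniform(-1.0, 1.0, size=(HORIZON, ACTION_DIM))
noise = rng.standard_normal((HORIZON, ACTION_DIM))

actions = euler_sample(make_toy_velocity(target), noise)

# Assumed layout: first 4 columns are stick axes, remaining 16 are buttons.
sticks, buttons = actions[:, :4], actions[:, 4:]
```

In the real model the velocity would come from the action decoder, conditioned on the VLM hidden states and the time embedding, rather than from a closed-form field.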
| ## Files | |
| - `action_decoder.pt`: Trained action decoder weights | |
| - `decoder_config.json`: Architecture and tokenizer config | |
| - `tokenizer.json` / `tokenizer_config.json`: Tokenizer with special tokens | |
| - `chat_template.jinja`: Chat template | |
| - `processor_config.json`: Processor config | |
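A minimal sketch for checking that a local copy of this repository contains the files listed above, using only the standard library. The helper names `missing_files` and `load_decoder_config` are hypothetical, and nothing is assumed about the keys inside `decoder_config.json`.

```python
import json
from pathlib import Path

# File names taken from the list above.
EXPECTED_FILES = [
    "action_decoder.pt",
    "decoder_config.json",
    "tokenizer.json",
    "tokenizer_config.json",
    "chat_template.jinja",
    "processor_config.json",
]


def missing_files(model_dir):
    """Return the expected file names that are not present in model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


def load_decoder_config(model_dir):
    """Parse decoder_config.json and return it as a plain dict."""
    with open(Path(model_dir) / "decoder_config.json", encoding="utf-8") as f:
        return json.load(f)
```

Calling `missing_files("path/to/download")` before loading the weights gives an early, readable error signal if the download was incomplete. The base VLM is not part of this check, since it is fetched separately at runtime.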