---
license: apache-2.0
tags:
  - game-ai
  - flow-matching
  - action-prediction
  - elden-ring
  - vla
base_model: Qwen/Qwen3.5-4B
---

# Pi-Lumine 4B — Flow-Matching Action Decoder for Elden Ring

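Pi-Lumine 4B is a Pi0.5-style flow-matching action decoder trained on top of a frozen Qwen3.5-4B VLM backbone. It predicts short sequences of controller actions (stick and button inputs) for Elden Ring.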

## Architecture

- **Base VLM**: Qwen/Qwen3.5-4B (frozen, not included — downloaded at runtime)
- **Action Decoder**: FiLM-conditioned transformer with cross-attention to VLM hidden states
  - 2 decoder layers, VLM dim 2560 → decoder dim 1024, 8 attention heads
  - Projection layers decouple the decoder width from the VLM hidden size
  - Instruction-conditioned via AdaptiveRMSNorm (FiLM)
  - Sinusoidal time embedding for flow matching
  - ~64M trainable parameters
- **Action Space**: 6 steps × 20 dims (4 sticks + 16 buttons per step)
- **Training**: Flow-matching objective; at inference, actions are generated by Euler integration of the learned ODE
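
To illustrate the inference step, here is a minimal, dependency-free sketch of Euler integration for a flow-matching sampler over a 6 × 20 action chunk. It assumes the common convention of integrating from Gaussian noise at t = 0 to actions at t = 1. The names `sample_actions` and `toy_velocity`, the default of 10 steps, and the toy velocity field are hypothetical stand-ins; in the real model the velocity would come from the action decoder, conditioned on the VLM hidden states and the instruction.

```python
import random

HORIZON, ACTION_DIM = 6, 20  # 6 steps x 20 dims, as listed above


def sample_actions(velocity_fn, num_steps=10, seed=0):
    """Euler-integrate dx/dt = velocity_fn(x, t) from t=0 (noise) to t=1."""
    rng = random.Random(seed)
    # Start from standard Gaussian noise with the shape of one action chunk.
    x = [[rng.gauss(0.0, 1.0) for _ in range(ACTION_DIM)] for _ in range(HORIZON)]
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        v = velocity_fn(x, t)  # assumed to return a HORIZON x ACTION_DIM nested list
        # One explicit Euler step: x <- x + dt * v
        x = [
            [xi + dt * vi for xi, vi in zip(x_row, v_row)]
            for x_row, v_row in zip(x, v)
        ]
    return x


def toy_velocity(target):
    """Hypothetical stand-in for the decoder: a field that flows toward `target`."""
    def fn(x, t):
        return [
            [(ti - xi) / (1.0 - t) for xi, ti in zip(x_row, t_row)]
            for x_row, t_row in zip(x, target)
        ]
    return fn


if __name__ == "__main__":
    target = [[0.5] * ACTION_DIM for _ in range(HORIZON)]
    actions = sample_actions(toy_velocity(target))
    print(len(actions), len(actions[0]))
```

With the toy field, the integration ends at the target chunk; a trained decoder would instead supply a learned velocity at each step, and the number of Euler steps trades sampling speed against integration accuracy.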

## Files

- `action_decoder.pt` — Trained action decoder weights
- `decoder_config.json` — Architecture and tokenizer config
- `tokenizer.json` / `tokenizer_config.json` — Tokenizer with special tokens
- `chat_template.jinja` — Chat template
- `processor_config.json` — Processor config
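
As a small sanity check before wiring these files into a pipeline, the sketch below verifies that a local copy of the repository contains every file listed above and parses `decoder_config.json`. The helper name `load_decoder_config` and the directory argument are hypothetical, and nothing is assumed about the keys inside the config.

```python
import json
from pathlib import Path

# File names as listed above.
EXPECTED_FILES = [
    "action_decoder.pt",
    "decoder_config.json",
    "tokenizer.json",
    "tokenizer_config.json",
    "chat_template.jinja",
    "processor_config.json",
]


def load_decoder_config(model_dir):
    """Check that all listed files are present, then return the parsed decoder config."""
    model_dir = Path(model_dir)
    missing = [name for name in EXPECTED_FILES if not (model_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(f"Missing files in {model_dir}: {', '.join(missing)}")
    with open(model_dir / "decoder_config.json", encoding="utf-8") as f:
        return json.load(f)
```

Loading `action_decoder.pt` itself is left out of the sketch because the checkpoint layout is not described here.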