Stack Doctor
Incident War Room
Training Analytics
Qwen3.5-9B — Episode Reward
100 GRPO steps — base model already near-oracle
Peak: +26.00
Base Avg: +19.50
Zero-Std: 72%
Qwen3.5-9B — Completion Length
Thinking mode consumed the token budget; completions hit the 2048-token cap
Collapse: Step 36
Clipping: Step 69
Qwen2.5-1.5B — Episode Reward
16 GRPO steps — weak model, real gradient signal
Best Step: -1.75
Avg: -4.90
Zero-Std: 0%
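The Zero-Std figures in the two reward panels can be read with a small sketch. It assumes that "Zero-Std" means the share of GRPO sample groups whose rewards all coincide, which the dashboard does not state outright. In that case the group-relative advantage is zero for every completion and the group contributes no gradient. The function names, the population standard deviation, the epsilon, and the reward values below are illustrative assumptions, not the dashboard's actual code or data.

```python
from statistics import pstdev


def group_advantages(rewards, eps=1e-6):
    """Group-relative advantages: (reward - group mean) / (group std + eps).

    Uses the population std; real trainers may use a different estimator or eps.
    """
    mean = sum(rewards) / len(rewards)
    std = pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


def zero_std_fraction(groups, tol=1e-8):
    """Fraction of groups whose rewards are all (nearly) identical."""
    flat = sum(1 for g in groups if pstdev(g) <= tol)
    return flat / len(groups)


# Hypothetical reward groups, invented for illustration only.
groups = [
    [1.0, 1.0, 1.0, 1.0],   # identical rewards: no learning signal
    [2.0, 2.0, 2.0, 2.0],   # identical rewards: no learning signal
    [0.0, 1.0, 2.0, 3.0],   # varied rewards: non-zero advantages
    [1.0, 3.0, 1.0, 3.0],   # varied rewards: non-zero advantages
]

flat_share = zero_std_fraction(groups)
flat_adv = group_advantages(groups[0])
varied_adv = group_advantages(groups[2])
```

Under this reading, a high Zero-Std share is what one would expect when a model already scores near the ceiling on most prompts, while a share of 0% means every group still separates better completions from worse ones.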
Live Environment
Inference Stack
Model
Kernel: Attention / GEMM
Backend
Runtime: CUDA / ROCm
Memory: HBM / KV Cache
Driver
Investigation Log
Specialist Agents