fix: re-judge experiment rollouts after credit-exhaustion; retry logic in batched + single-criterion judge 8410720 verified TheUnicat committed 12 days ago
feat: V₀=0.5 baseline + det ceilings, prompt-pill UI, 60 experiment rollouts 2cd2802 verified TheUnicat committed 12 days ago
feat: per-turn state value + turn score on Demo (state_v1 trajectories baked into 228 rollouts) e7cdebc verified TheUnicat committed 12 days ago