Commit History

Phase 1 engine audit: ENGINE_AUDIT.md + bench-side closures
c634971

yxc20098 committed on

Quality drive: schema fix, 5 new/revised packs, 4 engine tests, scenario audit
6d71d3b

yxc20098 committed on

Engine-feature integration: 4 commands + 9 scenario packs + 4 test suites
20960c1

yxc20098 committed on

Add perception ablation grid (observation channel × fog of war)
4a5b0dd

yxc20098 committed on

Phase 1: unified Controller interface for the eval stack
c68e036

yxc20098 committed on

botgen: surface scripted opponents for YAML; drop premature capture_actor
ca911f6

yxc20098 committed on

Scenario configs (level+fog per cell) + adversarial-duel duel/interrupts + clearer names
f244b78

yxc20098 committed on

Structured-fog text mode, premium routing, codex descriptions, minimap colour-by-difficulty
93ee9dd

yxc20098 committed on

Wire bench to vendored training prompt v2 (system/briefing/minimap)
8e88074

yxc20098 committed on

Training-parity minimap (real terrain + legend) + viewer (system/thinking/debrief)
39fba02

yxc20098 committed on

Live-smoke fixes: tool-call wire 400, episode resilience, real PNG minimap
247ff7a

yxc20098 committed on

Deterministic scenario-scoped game knowledge + explicit objective
049448a

yxc20098 committed on

S8: capture_actor tool schema + dispatch; wildcard count 21->22
18612b1

yxc20098 committed on

Eval resilience layer for real OpenRouter runs
424da31

yxc20098 committed on

cargo tools: enter_transport + unload, 1:1 congruence
755ab44

yxc20098 committed on

Unified Battle Viewer in app.py + run/model playback identity
0a488d3

yxc20098 committed on

guard tool: 1:1 congruence with engine Command.guard
18d038a

yxc20098 committed on

Playback: capture model reasoning + per-turn goal tracker + viewer
f77eea7

yxc20098 committed on

Scenario-controlled tool allow/deny + default core set
f912cfc

yxc20098 committed on

S7 bench: set_stance + patrol tools (schema 1:1, 17==17)
eca1d53

yxc20098 committed on

S7 bench: surrender tool + loss outcome (tool schema 1:1, 15==15)
09ac234

yxc20098 committed on

Building & Planning scenario family (user-specified first set)
a919131

yxc20098 committed on

Bench: consume S9 economy obs + economy win-conditions + full toolset
5a1cf72

yxc20098 committed on

Add provider-agnostic model agent (vLLM/OpenRouter/Bedrock)
715cbbc

yxc20098 committed on