fix(scenario): combat-vehicle-vs-infantry-counter — restore no-cheat bar after armor-class engine fix
The OpenRA-Rust armor-class engine fix (4d91fe0) made pre-placed agent
combat units auto-fire effectively. The starter scout jeep then racked
up kills on its own, so a pure-observe `stall` policy reached the kill
bar and WON — violating the no-cheat bar.
Restore the bar:
- Starter jeep set to `stance: 0` (HoldFire) on every level / spawn
  group: it scouts, it never auto-fires, so a stall policy scores
  zero kills.
- Win predicate gains `unit_type_count_gte 2tnk:3`: the agent must
  ACTUALLY field the 3-tank fist. Stall and wrong-counter (e3 / e1)
  policies never build 2tnk → win clause structurally unmet.
- Fail clause gains `not own_units_gte:1` so a stalled-and-overrun
  episode is a real LOSS, not an engine auto-`done` DRAW (the agent
  starts with the jeep so the unit-less turn-1 mis-fire footgun does
  not apply).
- Add `powr` + `fix` to every base: the war-factory vehicle queue
  needs power online and a service depot for `2tnk` to clear its
  prerequisites, so the tank counter is producible from turn 1.
Validated via scripted policies on 3 levels x seeds 1..4:
stall / build-e3 / build-e1 LOSE everywhere (real LOSS, no DRAW);
intended build-2tnk WINS everywhere.
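
A minimal sketch of how the reworked win and fail clauses separate the
scripted policies, assuming a simplified end-of-episode snapshot.
`Snapshot`, `wins`, and `fails` are hypothetical names used only for
illustration; they are not the repo's `WinContext` / `evaluate`, and the
tick boundaries (`<= 5400`, `>= 5401`) are an assumption about how
`within_ticks` and `after_ticks` compare.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Snapshot:
    """Hypothetical end-of-episode state (illustration only)."""
    tick: int
    kills: int
    unit_types: list[str] = field(default_factory=list)  # living agent units
    has_fact: bool = True


def wins(s: Snapshot, kill_bar: int) -> bool:
    # all_of: 2tnk:3 fielded, kill bar met, fact standing, inside the window
    return (
        s.unit_types.count("2tnk") >= 3
        and s.kills >= kill_bar
        and s.has_fact
        and s.tick <= 5400  # assumed within_ticks semantics
    )


def fails(s: Snapshot) -> bool:
    # any_of: timeout, fact destroyed, force wiped (no agent units left)
    return s.tick >= 5401 or not s.has_fact or len(s.unit_types) < 1


# Illustrative final states on easy (kill bar 6), not recorded engine output.
intended = Snapshot(tick=2000, kills=6, unit_types=["2tnk"] * 3)
wrong_counter = Snapshot(tick=2000, kills=6, unit_types=["e3"] * 8)
overrun_stall = Snapshot(tick=2000, kills=0, unit_types=[])
```

Under these assumptions `wins(intended, 6)` holds, `wins(wrong_counter, 6)`
does not (no 2tnk, whatever the kill count), and `fails(overrun_stall)`
holds through the force-wipe clause. The diff uses a kill bar of 6 on easy
and 8 on medium and hard.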
|
@@ -21,33 +21,43 @@
|
|
| 21 |
# Pre-placed (each spawn group): agent `fact` + `tent` (infantry
|
| 22 |
# trainer; enables e1 and e3) + `weap` (vehicle trainer; enables
|
| 23 |
# 2tnk) + a single starter `jeep` (allies scout vehicle; visibility
|
| 24 |
-
# over the enemy composition
|
| 25 |
-
# so
|
| 26 |
-
#
|
| 27 |
-
#
|
| 28 |
-
#
|
| 29 |
#
|
| 30 |
# Discrimination on EASY / MEDIUM (single enemy cluster, both
|
| 31 |
-
# compositions are buildable from t=1)
|
| 32 |
-
#
|
| 33 |
-
#
|
| 34 |
-
#
|
| 35 |
-
#
|
| 36 |
-
#
|
| 37 |
-
#
|
| 38 |
-
#
|
| 39 |
-
# • build-only-
|
| 40 |
-
#
|
| 41 |
-
#
|
| 42 |
-
#
|
| 43 |
-
#
|
| 44 |
-
#
|
| 45 |
-
#
|
| 46 |
-
#
|
| 47 |
-
#
|
| 48 |
-
#
|
| 49 |
-
#
|
| 50 |
-
#
|
|
|
|
| 51 |
#
|
| 52 |
# Discrimination on HARD (+1 axis: 2 agent spawn_point groups):
|
| 53 |
# • The agent base seed-rotates between NORTH (y=12) and SOUTH
|
|
@@ -70,15 +80,31 @@
|
|
| 70 |
# • `after_ticks` fail clauses reachable within max_turns
|
| 71 |
# (within_ticks 5400 ≤ 5403 at max_turns 60): a staller hits
|
| 72 |
# after_ticks 5401 and LOSES, never draws.
|
| 73 |
-
# • Starting `jeep`
|
| 74 |
-
#
|
| 75 |
-
#
|
| 76 |
# • Spawn-group footgun (CLAUDE.md oramap): on hard, ANY agent
|
| 77 |
# actor with `spawn_point` filters OUT every agent actor without
|
| 78 |
-
# one — so BOTH bases (fact + tent + weap + jeep)
|
| 79 |
-
# duplicated under BOTH spawn_point groups at their
|
| 80 |
-
# coords. The single far enemy fact at (124,20) and
|
| 81 |
-
# enemy cluster have NO spawn_point and place on
|
|
|
|
| 82 |
# • starting_cash $2550 = exactly 3× 2tnk; 8× e3 = $2400 (cash
|
| 83 |
# unspent); 25× e1 = $2500. Neither rocket nor rifle mass is
|
| 84 |
# dominant against the entrenched enemy.
|
|
@@ -130,13 +156,15 @@ base:
|
|
| 130 |
planning: true
|
| 131 |
termination: {max_ticks: 8000}
|
| 132 |
# Default base (overridden on hard). The starter jeep gives turn-1
|
| 133 |
-
# scouting (sight 7c)
|
| 134 |
-
#
|
| 135 |
actors:
|
| 136 |
- {type: fact, owner: agent, position: [10, 20]}
|
|
|
|
| 137 |
- {type: tent, owner: agent, position: [14, 18]}
|
| 138 |
- {type: weap, owner: agent, position: [14, 22]}
|
| 139 |
-
- {type:
|
|
|
|
| 140 |
# Far persistent enemy marker — prevents engine auto-done when
|
| 141 |
# the live infantry cluster falls so the win/fail evaluator sees
|
| 142 |
# the terminal frame.
|
|
@@ -146,9 +174,9 @@ levels:
|
|
| 146 |
# ── EASY ─────────────────────────────────────────────────────────
|
| 147 |
# Bare counter-selection skill: a small visible enemy infantry
|
| 148 |
# cluster (8× e1) on the centre lane. Cash $2550 funds 3× 2tnk
|
| 149 |
-
# (the right counter) cleanly.
|
| 150 |
-
#
|
| 151 |
-
# unmet → after_ticks LOSS.
|
| 152 |
easy:
|
| 153 |
description: >
|
| 154 |
Cash $2550. The enemy is a small cluster of 8 rifle infantry
|
|
@@ -160,16 +188,18 @@ levels:
|
|
| 160 |
armoured fist (2tnk @ $850, exactly 3 for $2550) OR rifle
|
| 161 |
infantry (e1 @ $100, up to 25). Scout the enemy with the jeep,
|
| 162 |
pick the correct hard counter, and commit the whole budget.
|
| 163 |
-
Win when
|
| 164 |
-
|
| 165 |
-
5400. Stalling
|
| 166 |
-
|
| 167 |
overrides:
|
| 168 |
actors:
|
| 169 |
- {type: fact, owner: agent, position: [10, 20]}
|
|
|
|
| 170 |
- {type: tent, owner: agent, position: [14, 18]}
|
| 171 |
- {type: weap, owner: agent, position: [14, 22]}
|
| 172 |
-
- {type:
|
|
|
|
| 173 |
# 8× e1 entrenched cluster — stance:3 (attack anything in
|
| 174 |
# range). Spread over a 3×3 grid centred on (70,20) so the
|
| 175 |
# mass can be engaged from any approach axis.
|
|
@@ -184,15 +214,15 @@ levels:
|
|
| 184 |
- {type: fact, owner: enemy, position: [124, 20]}
|
| 185 |
win_condition:
|
| 186 |
all_of:
|
|
|
|
| 187 |
- {units_killed_gte: 6}
|
| 188 |
-
- {own_units_gte: 1}
|
| 189 |
- {has_building: fact}
|
| 190 |
- {within_ticks: 5400}
|
| 191 |
fail_condition:
|
| 192 |
any_of:
|
| 193 |
- {after_ticks: 5401}
|
| 194 |
-
- {not: {own_units_gte: 1}}
|
| 195 |
- {not: {has_building: fact}}
|
|
|
|
| 196 |
max_turns: 60
|
| 197 |
|
| 198 |
# ── MEDIUM ───────────────────────────────────────────────────────
|
|
@@ -214,15 +244,17 @@ levels:
|
|
| 214 |
through small-arms fire. Mass rockets (e3 @ $300) waste cost-
|
| 215 |
per-effect against soft targets and get out-DPSed by the rifle
|
| 216 |
mass on attrition; matching with own rifles (e1 @ $100) is a
|
| 217 |
-
1:1 trade with no advantage. Win when
|
| 218 |
-
|
| 219 |
-
before tick 5400.
|
| 220 |
overrides:
|
| 221 |
actors:
|
| 222 |
- {type: fact, owner: agent, position: [10, 20]}
|
|
|
|
| 223 |
- {type: tent, owner: agent, position: [14, 18]}
|
| 224 |
- {type: weap, owner: agent, position: [14, 22]}
|
| 225 |
-
- {type:
|
|
|
|
| 226 |
# 12× e1 entrenched cluster — deeper centre mass.
|
| 227 |
- {type: e1, owner: enemy, position: [70, 17], stance: 3}
|
| 228 |
- {type: e1, owner: enemy, position: [70, 18], stance: 3}
|
|
@@ -239,15 +271,15 @@ levels:
|
|
| 239 |
- {type: fact, owner: enemy, position: [124, 20]}
|
| 240 |
win_condition:
|
| 241 |
all_of:
|
|
|
|
| 242 |
- {units_killed_gte: 8}
|
| 243 |
-
- {own_units_gte: 1}
|
| 244 |
- {has_building: fact}
|
| 245 |
- {within_ticks: 5400}
|
| 246 |
fail_condition:
|
| 247 |
any_of:
|
| 248 |
- {after_ticks: 5401}
|
| 249 |
-
- {not: {own_units_gte: 1}}
|
| 250 |
- {not: {has_building: fact}}
|
|
|
|
| 251 |
max_turns: 60
|
| 252 |
|
| 253 |
# ── HARD ─────────────────────────────────────────────────────────
|
|
@@ -271,21 +303,25 @@ levels:
|
|
| 271 |
medium tanks (2tnk) walk through small-arms fire; mass rockets
|
| 272 |
(e3) waste cost-per-effect against soft targets and get out-
|
| 273 |
DPSed on attrition; matching with own rifles is a 1:1 trade
|
| 274 |
-
with no advantage. Win when
|
| 275 |
-
|
| 276 |
-
5400.
|
| 277 |
overrides:
|
| 278 |
actors:
|
| 279 |
# ── AGENT spawn 0 — NORTH base (y=12) ─────────────────────
|
| 280 |
- {type: fact, owner: agent, position: [10, 12], spawn_point: 0}
|
|
|
|
| 281 |
- {type: tent, owner: agent, position: [14, 10], spawn_point: 0}
|
| 282 |
- {type: weap, owner: agent, position: [14, 14], spawn_point: 0}
|
| 283 |
-
- {type:
|
|
|
|
| 284 |
# ── AGENT spawn 1 — SOUTH base (y=28) ─────────────────────
|
| 285 |
- {type: fact, owner: agent, position: [10, 28], spawn_point: 1}
|
|
|
|
| 286 |
- {type: tent, owner: agent, position: [14, 26], spawn_point: 1}
|
| 287 |
- {type: weap, owner: agent, position: [14, 30], spawn_point: 1}
|
| 288 |
-
- {type:
|
|
|
|
| 289 |
# ── CENTRE ENEMY CLUSTER — pure infantry (always places) ──
|
| 290 |
# 12× e1 entrenched on the central lane. Per CLAUDE.md, enemy
|
| 291 |
# actors do not honour spawn_point — this cluster lands on
|
|
@@ -307,13 +343,13 @@ levels:
|
|
| 307 |
- {type: fact, owner: enemy, position: [124, 20]}
|
| 308 |
win_condition:
|
| 309 |
all_of:
|
|
|
|
| 310 |
- {units_killed_gte: 8}
|
| 311 |
-
- {own_units_gte: 1}
|
| 312 |
- {has_building: fact}
|
| 313 |
- {within_ticks: 5400}
|
| 314 |
fail_condition:
|
| 315 |
any_of:
|
| 316 |
- {after_ticks: 5401}
|
| 317 |
-
- {not: {own_units_gte: 1}}
|
| 318 |
- {not: {has_building: fact}}
|
|
|
|
| 319 |
max_turns: 60
|
|
|
|
| 21 |
# Pre-placed (each spawn group): agent `fact` + `tent` (infantry
|
| 22 |
# trainer; enables e1 and e3) + `weap` (vehicle trainer; enables
|
| 23 |
# 2tnk) + a single starter `jeep` (allies scout vehicle; visibility
|
| 24 |
+
# over the enemy composition). The starter jeep is `stance: 0`
|
| 25 |
+
# (HoldFire) so it NEVER auto-fires — it scouts, it does not fight.
|
| 26 |
+
# This is the load-bearing anti-stall guard (CLAUDE.md: an armour-
|
| 27 |
+
# class engine fix made pre-placed agent combat units auto-fire
|
| 28 |
+
# effectively, so a HoldFire jeep cannot rack up kills on its own).
|
| 29 |
+
# The tent+weap pair makes BOTH counter compositions buildable from
|
| 30 |
+
# turn 1 — the decision the model faces is composition, not tech-up.
|
| 31 |
+
#
|
| 32 |
+
# The win predicate requires `unit_type_count_gte: 2tnk:3` — the
|
| 33 |
+
# agent must have ACTUALLY FIELDED 3 medium tanks (the right
|
| 34 |
+
# counter). A stall policy never builds tanks; a wrong-counter
|
| 35 |
+
# policy (e3 / e1) never builds 2tnk — neither can satisfy the win
|
| 36 |
+
# regardless of how many kills the entrenched enemy concedes. Only
|
| 37 |
+
# building + commanding the 3-tank fist clears the bar.
|
| 38 |
#
|
| 39 |
# Discrimination on EASY / MEDIUM (single enemy cluster, both
|
| 40 |
+
# compositions are buildable from t=1). The win predicate is
|
| 41 |
+
# `unit_type_count_gte 2tnk:3 AND units_killed_gte K AND
|
| 42 |
+
# has_building fact` — the 2tnk:3 clause means ONLY a policy that
|
| 43 |
+
# actually builds the 3-tank fist can clear the bar:
|
| 44 |
+
# • stall (only observe): builds nothing; the HoldFire jeep never
|
| 45 |
+
# fires → 0 kills, 0 tanks. The entrenched e1 swarm eventually
|
| 46 |
+
# hunts down the idle jeep → force-wipe (not own_units_gte:1)
|
| 47 |
+
# LOSS (or after_ticks LOSS if the jeep is never reached).
|
| 48 |
+
# • build-only-e1 (match enemy 1:1 with cheap rifles): never
|
| 49 |
+
# builds 2tnk → the 2tnk:3 clause is structurally unmet; even
|
| 50 |
+
# a full kill count cannot win → after_ticks / force-wipe LOSS.
|
| 51 |
+
# • build-only-e3 (anti-armour rockets against infantry — the
|
| 52 |
+
# wrong counter): never builds 2tnk → the 2tnk:3 clause is
|
| 53 |
+
# structurally unmet; the rocket squad is also OUT-SHOT by the
|
| 54 |
+
# e1 mass on raw infantry-vs-infantry numbers (e3 hp45 < e1
|
| 55 |
+
# hp50; slow anti-armour projectiles under-perform vs small
|
| 56 |
+
# targets) → after_ticks / force-wipe LOSS.
|
| 57 |
+
# • intended build-2tnk (3× medium tanks @ $850 = $2550): the
|
| 58 |
+
# right counter — heavy armour soaks small-arms fire, tank
|
| 59 |
+
# dps22 + rng 4.75 + sight 6c walks through a static e1 mass.
|
| 60 |
+
# 2tnk:3 clause met, kill bar met, fact intact → WIN.
|
| 61 |
#
|
| 62 |
# Discrimination on HARD (+1 axis: 2 agent spawn_point groups):
|
| 63 |
# • The agent base seed-rotates between NORTH (y=12) and SOUTH
|
|
|
|
| 80 |
# • `after_ticks` fail clauses reachable within max_turns
|
| 81 |
# (within_ticks 5400 ≤ 5403 at max_turns 60): a staller hits
|
| 82 |
# after_ticks 5401 and LOSES, never draws.
|
| 83 |
+
# • Starting `jeep` is `stance: 0` (HoldFire) — it scouts but
|
| 84 |
+
# never auto-fires, so a pure-observe stall policy cannot score
|
| 85 |
+
# kills with it. The win predicate's `unit_type_count_gte`
|
| 86 |
+
# 2tnk:3 is the real anti-cheat: only the agent that builds the
|
| 87 |
+
# 3-tank fist can clear the bar. The fail clause is
|
| 88 |
+
# `after_ticks | not has_building:fact | not own_units_gte:1` —
|
| 89 |
+
# the agent starts WITH the jeep so `own_units_gte:1` is
|
| 90 |
+
# satisfied from t=0 (the unit-less turn-1 mis-fire footgun in
|
| 91 |
+
# CLAUDE.md does not apply); the force-wipe clause turns a
|
| 92 |
+
# stalled-and-overrun episode into a real LOSS instead of an
|
| 93 |
+
# engine auto-`done` DRAW when the entrenched e1 swarm hunts
|
| 94 |
+
# down the idle HoldFire jeep.
|
| 95 |
+
# • The vehicle queue (`weap` war factory) needs `powr` online
|
| 96 |
+
# AND a `fix` service depot present for `2tnk` to clear its
|
| 97 |
+
# prerequisites — both are pre-placed in every base so the
|
| 98 |
+
# tank counter is producible from turn 1 (the decision is
|
| 99 |
+
# composition, not tech-up). `tent` produces e1/e3 with `powr`
|
| 100 |
+
# alone.
|
| 101 |
# • Spawn-group footgun (CLAUDE.md oramap): on hard, ANY agent
|
| 102 |
# actor with `spawn_point` filters OUT every agent actor without
|
| 103 |
+
# one — so BOTH bases (fact + powr + tent + weap + fix + jeep)
|
| 104 |
+
# are duplicated under BOTH spawn_point groups at their
|
| 105 |
+
# respective coords. The single far enemy fact at (124,20) and
|
| 106 |
+
# the centre enemy cluster have NO spawn_point and place on
|
| 107 |
+
# every seed.
|
| 108 |
# • starting_cash $2550 = exactly 3× 2tnk; 8× e3 = $2400 (cash
|
| 109 |
# unspent); 25× e1 = $2500. Neither rocket nor rifle mass is
|
| 110 |
# dominant against the entrenched enemy.
|
|
|
|
| 156 |
planning: true
|
| 157 |
termination: {max_ticks: 8000}
|
| 158 |
# Default base (overridden on hard). The starter jeep gives turn-1
|
| 159 |
+
# scouting (sight 7c). It is `stance: 0` (HoldFire) so it never
|
| 160 |
+
# auto-fires — a stall policy cannot score kills with it.
|
| 161 |
actors:
|
| 162 |
- {type: fact, owner: agent, position: [10, 20]}
|
| 163 |
+
- {type: powr, owner: agent, position: [10, 16]}
|
| 164 |
- {type: tent, owner: agent, position: [14, 18]}
|
| 165 |
- {type: weap, owner: agent, position: [14, 22]}
|
| 166 |
+
- {type: fix, owner: agent, position: [18, 20]}
|
| 167 |
+
- {type: jeep, owner: agent, position: [12, 20], stance: 0}
|
| 168 |
# Far persistent enemy marker — prevents engine auto-done when
|
| 169 |
# the live infantry cluster falls so the win/fail evaluator sees
|
| 170 |
# the terminal frame.
|
|
|
|
| 174 |
# ── EASY ─────────────────────────────────────────────────────────
|
| 175 |
# Bare counter-selection skill: a small visible enemy infantry
|
| 176 |
# cluster (8× e1) on the centre lane. Cash $2550 funds 3× 2tnk
|
| 177 |
+
# (the right counter) cleanly. The win requires fielding 3× 2tnk
|
| 178 |
+
# AND 6 kills; stall / wrong-counter never field the tanks → kill
|
| 179 |
+
# bar + 2tnk:3 clause unmet → after_ticks LOSS.
|
| 180 |
easy:
|
| 181 |
description: >
|
| 182 |
Cash $2550. The enemy is a small cluster of 8 rifle infantry
|
|
|
|
| 188 |
armoured fist (2tnk @ $850, exactly 3 for $2550) OR rifle
|
| 189 |
infantry (e1 @ $100, up to 25). Scout the enemy with the jeep,
|
| 190 |
pick the correct hard counter, and commit the whole budget.
|
| 191 |
+
Win when you have fielded 3 medium tanks (2tnk) AND 6 enemy
|
| 192 |
+
units are killed AND your construction yard still stands,
|
| 193 |
+
before tick 5400. Stalling or picking the wrong counter never
|
| 194 |
+
fields the 3-tank fist and fails the bar.
|
| 195 |
overrides:
|
| 196 |
actors:
|
| 197 |
- {type: fact, owner: agent, position: [10, 20]}
|
| 198 |
+
- {type: powr, owner: agent, position: [10, 16]}
|
| 199 |
- {type: tent, owner: agent, position: [14, 18]}
|
| 200 |
- {type: weap, owner: agent, position: [14, 22]}
|
| 201 |
+
- {type: fix, owner: agent, position: [18, 20]}
|
| 202 |
+
- {type: jeep, owner: agent, position: [12, 20], stance: 0}
|
| 203 |
# 8× e1 entrenched cluster — stance:3 (attack anything in
|
| 204 |
# range). Spread over a 3×3 grid centred on (70,20) so the
|
| 205 |
# mass can be engaged from any approach axis.
|
|
|
|
| 214 |
- {type: fact, owner: enemy, position: [124, 20]}
|
| 215 |
win_condition:
|
| 216 |
all_of:
|
| 217 |
+
- {unit_type_count_gte: {type: 2tnk, n: 3}}
|
| 218 |
- {units_killed_gte: 6}
|
|
|
|
| 219 |
- {has_building: fact}
|
| 220 |
- {within_ticks: 5400}
|
| 221 |
fail_condition:
|
| 222 |
any_of:
|
| 223 |
- {after_ticks: 5401}
|
|
|
|
| 224 |
- {not: {has_building: fact}}
|
| 225 |
+
- {not: {own_units_gte: 1}}
|
| 226 |
max_turns: 60
|
| 227 |
|
| 228 |
# ── MEDIUM ───────────────────────────────────────────────────────
|
|
|
|
| 244 |
through small-arms fire. Mass rockets (e3 @ $300) waste cost-
|
| 245 |
per-effect against soft targets and get out-DPSed by the rifle
|
| 246 |
mass on attrition; matching with own rifles (e1 @ $100) is a
|
| 247 |
+
1:1 trade with no advantage. Win when you have fielded 3
|
| 248 |
+
medium tanks (2tnk) AND 8 enemy units are killed AND your fact
|
| 249 |
+
still stands, before tick 5400.
|
| 250 |
overrides:
|
| 251 |
actors:
|
| 252 |
- {type: fact, owner: agent, position: [10, 20]}
|
| 253 |
+
- {type: powr, owner: agent, position: [10, 16]}
|
| 254 |
- {type: tent, owner: agent, position: [14, 18]}
|
| 255 |
- {type: weap, owner: agent, position: [14, 22]}
|
| 256 |
+
- {type: fix, owner: agent, position: [18, 20]}
|
| 257 |
+
- {type: jeep, owner: agent, position: [12, 20], stance: 0}
|
| 258 |
# 12× e1 entrenched cluster — deeper centre mass.
|
| 259 |
- {type: e1, owner: enemy, position: [70, 17], stance: 3}
|
| 260 |
- {type: e1, owner: enemy, position: [70, 18], stance: 3}
|
|
|
|
| 271 |
- {type: fact, owner: enemy, position: [124, 20]}
|
| 272 |
win_condition:
|
| 273 |
all_of:
|
| 274 |
+
- {unit_type_count_gte: {type: 2tnk, n: 3}}
|
| 275 |
- {units_killed_gte: 8}
|
|
|
|
| 276 |
- {has_building: fact}
|
| 277 |
- {within_ticks: 5400}
|
| 278 |
fail_condition:
|
| 279 |
any_of:
|
| 280 |
- {after_ticks: 5401}
|
|
|
|
| 281 |
- {not: {has_building: fact}}
|
| 282 |
+
- {not: {own_units_gte: 1}}
|
| 283 |
max_turns: 60
|
| 284 |
|
| 285 |
# ── HARD ─────────────────────────────────────────────────────────
|
|
|
|
| 303 |
medium tanks (2tnk) walk through small-arms fire; mass rockets
|
| 304 |
(e3) waste cost-per-effect against soft targets and get out-
|
| 305 |
DPSed on attrition; matching with own rifles is a 1:1 trade
|
| 306 |
+
with no advantage. Win when you have fielded 3 medium tanks
|
| 307 |
+
(2tnk) AND 8 enemy units are killed AND your fact still
|
| 308 |
+
stands, before tick 5400.
|
| 309 |
overrides:
|
| 310 |
actors:
|
| 311 |
# ── AGENT spawn 0 — NORTH base (y=12) ─────────────────────
|
| 312 |
- {type: fact, owner: agent, position: [10, 12], spawn_point: 0}
|
| 313 |
+
- {type: powr, owner: agent, position: [10, 8], spawn_point: 0}
|
| 314 |
- {type: tent, owner: agent, position: [14, 10], spawn_point: 0}
|
| 315 |
- {type: weap, owner: agent, position: [14, 14], spawn_point: 0}
|
| 316 |
+
- {type: fix, owner: agent, position: [18, 12], spawn_point: 0}
|
| 317 |
+
- {type: jeep, owner: agent, position: [12, 12], stance: 0, spawn_point: 0}
|
| 318 |
# ── AGENT spawn 1 — SOUTH base (y=28) ─────────────────────
|
| 319 |
- {type: fact, owner: agent, position: [10, 28], spawn_point: 1}
|
| 320 |
+
- {type: powr, owner: agent, position: [10, 32], spawn_point: 1}
|
| 321 |
- {type: tent, owner: agent, position: [14, 26], spawn_point: 1}
|
| 322 |
- {type: weap, owner: agent, position: [14, 30], spawn_point: 1}
|
| 323 |
+
- {type: fix, owner: agent, position: [18, 28], spawn_point: 1}
|
| 324 |
+
- {type: jeep, owner: agent, position: [12, 28], stance: 0, spawn_point: 1}
|
| 325 |
# ── CENTRE ENEMY CLUSTER — pure infantry (always places) ──
|
| 326 |
# 12× e1 entrenched on the central lane. Per CLAUDE.md, enemy
|
| 327 |
# actors do not honour spawn_point — this cluster lands on
|
|
|
|
| 343 |
- {type: fact, owner: enemy, position: [124, 20]}
|
| 344 |
win_condition:
|
| 345 |
all_of:
|
| 346 |
+
- {unit_type_count_gte: {type: 2tnk, n: 3}}
|
| 347 |
- {units_killed_gte: 8}
|
|
|
|
| 348 |
- {has_building: fact}
|
| 349 |
- {within_ticks: 5400}
|
| 350 |
fail_condition:
|
| 351 |
any_of:
|
| 352 |
- {after_ticks: 5401}
|
|
|
|
| 353 |
- {not: {has_building: fact}}
|
| 354 |
+
- {not: {own_units_gte: 1}}
|
| 355 |
max_turns: 60
|
|
@@ -10,16 +10,27 @@ tank ordnance against soft targets — cost-per-effect waste + the
|
|
| 10 |
rocket squad's short stand-off + low HP gets out-DPSed by the rifle
|
| 11 |
mass); matching with own rifles is a 1:1 attrition that loses.
|
| 12 |
|
| 13 |
The bar (per the spec):
|
| 14 |
-
• stall (only observe) → LOSS (
|
| 15 |
-
|
| 16 |
-
• build-only-
|
| 17 |
-
|
| 18 |
-
|
| 19 |
-
|
| 20 |
-
|
| 21 |
-
|
| 22 |
-
|
| 23 |
"""
|
| 24 |
from __future__ import annotations
|
| 25 |
|
|
@@ -67,8 +78,13 @@ def test_pack_compiles_and_meta_fields_populated():
|
|
| 67 |
)
|
| 68 |
|
| 69 |
|
| 70 |
-
def _ctx(*,
|
| 71 |
-
"""Synthesize a WinContext for predicate-level checks.
|
| 72 |
import types
|
| 73 |
|
| 74 |
sig = types.SimpleNamespace(
|
|
@@ -78,72 +94,105 @@ def _ctx(*, units=(), tick=1000, kills=0, lost=0, has_fact=True):
|
|
| 78 |
cash=0,
|
| 79 |
resources=0,
|
| 80 |
own_buildings=[],
|
| 81 |
-
own_building_types=
|
| 82 |
enemies_seen_ids=set(),
|
| 83 |
enemy_buildings_seen_ids=set(),
|
| 84 |
)
|
| 85 |
return WinContext(
|
| 86 |
signals=sig,
|
| 87 |
render_state={"units_summary": list(units)},
|
| 88 |
)
|
| 89 |
|
| 90 |
|
| 91 |
-
def _alive(n, unit_type="2tnk"):
|
| 92 |
-
return [
|
| 93 |
-
{"cell_x": 30, "cell_y": 20, "type": unit_type, "id": str(1000 + i)}
|
| 94 |
-
for i in range(n)
|
| 95 |
-
]
|
| 96 |
-
|
| 97 |
-
|
| 98 |
def test_easy_predicates():
|
| 99 |
c = compile_level(load_pack(PACK_PATH), "easy")
|
| 100 |
-
# Intended:
|
| 101 |
-
assert evaluate(c.win_condition, _ctx(
|
|
|
|
|
|
|
| 102 |
# Kill bar unmet (only 5 kills) → not a win
|
| 103 |
-
assert not evaluate(c.win_condition, _ctx(
|
| 104 |
# Force wipe (all units dead) → fail via not own_units_gte:1
|
| 105 |
assert evaluate(c.fail_condition, _ctx(units=[], tick=2000, kills=6, lost=4))
|
| 106 |
# Fact destroyed → fail via not has_building:fact
|
| 107 |
assert evaluate(
|
| 108 |
c.fail_condition,
|
| 109 |
-
_ctx(
|
| 110 |
)
|
| 111 |
# Timeout with bar unmet → fail (after_ticks 5401 reachable)
|
| 112 |
-
assert evaluate(c.fail_condition, _ctx(
|
| 113 |
|
| 114 |
|
| 115 |
def test_medium_predicates():
|
| 116 |
c = compile_level(load_pack(PACK_PATH), "medium")
|
| 117 |
-
# Intended:
|
| 118 |
-
assert evaluate(c.win_condition, _ctx(
|
|
|
|
|
|
|
| 119 |
# Bar unmet (only 7 kills) → not a win
|
| 120 |
-
assert not evaluate(c.win_condition, _ctx(
|
| 121 |
# Force wipe → fail
|
| 122 |
assert evaluate(c.fail_condition, _ctx(units=[], tick=2000, kills=8, lost=4))
|
| 123 |
# Fact destroyed → fail
|
| 124 |
assert evaluate(
|
| 125 |
c.fail_condition,
|
| 126 |
-
_ctx(
|
| 127 |
)
|
| 128 |
# Timeout → fail
|
| 129 |
-
assert evaluate(c.fail_condition, _ctx(
|
| 130 |
|
| 131 |
|
| 132 |
def test_hard_predicates():
|
| 133 |
c = compile_level(load_pack(PACK_PATH), "hard")
|
| 134 |
-
# Intended:
|
| 135 |
-
assert evaluate(c.win_condition, _ctx(
|
|
|
|
|
|
|
| 136 |
# Bar unmet → not a win
|
| 137 |
-
assert not evaluate(c.win_condition, _ctx(
|
| 138 |
# Force wipe → fail
|
| 139 |
assert evaluate(c.fail_condition, _ctx(units=[], tick=2000, kills=8, lost=4))
|
| 140 |
# Fact destroyed → fail
|
| 141 |
assert evaluate(
|
| 142 |
c.fail_condition,
|
| 143 |
-
_ctx(
|
| 144 |
)
|
| 145 |
# Timeout → fail
|
| 146 |
-
assert evaluate(c.fail_condition, _ctx(
|
| 147 |
|
| 148 |
|
| 149 |
def test_timeout_reachable_inside_max_turns():
|
|
@@ -198,17 +247,37 @@ def test_enemy_is_pure_infantry_no_anti_armour():
|
|
| 198 |
assert n_e1 >= 6, f"{lvl}: needs ≥6 e1 in the enemy cluster; got {n_e1}"
|
| 199 |
|
| 200 |
|
| 201 |
-
def
|
| 202 |
"""The composition decision is COMPOSITION, not tech-up. Each
|
| 203 |
-
spawn group on every level must have
|
| 204 |
-
|
| 205 |
-
|
| 206 |
-
|
| 207 |
-
|
| 208 |
pack = load_pack(PACK_PATH)
|
| 209 |
for lvl in ("easy", "medium", "hard"):
|
| 210 |
c = compile_level(pack, lvl)
|
| 211 |
-
# Per spawn group, the agent must have
|
| 212 |
# On non-hard levels there is exactly one (default) spawn
|
| 213 |
# group (spawn_point None → 0); on hard there are two.
|
| 214 |
groups: dict[int, list] = {}
|
|
@@ -219,7 +288,7 @@ def test_agent_base_has_both_production_queues():
|
|
| 219 |
groups.setdefault(g, []).append(a.type)
|
| 220 |
assert groups, f"{lvl}: no agent actors found"
|
| 221 |
for g, ts in groups.items():
|
| 222 |
-
for need in ("fact", "tent", "weap", "jeep"):
|
| 223 |
assert need in ts, (
|
| 224 |
f"{lvl}: spawn group {g} missing {need}; got {ts}"
|
| 225 |
)
|
|
@@ -244,33 +313,137 @@ def test_starting_cash_funds_exactly_one_pure_composition():
|
|
| 244 |
assert 25 * 100 == 2500
|
| 245 |
|
| 246 |
|
| 247 |
-
# ── engine-driven scripted
|
| 248 |
#
|
| 249 |
-
# The full RPS-counter bar (
|
| 250 |
-
# 2tnk WINS)
|
| 251 |
-
#
|
| 252 |
-
|
| 253 |
-
|
| 254 |
|
| 255 |
|
| 256 |
def _stall(rs, Command):
|
| 257 |
-
"""Pure observe —
|
|
|
|
|
|
|
| 258 |
return [Command.observe()]
|
| 259 |
|
| 260 |
|
| 261 |
@pytest.mark.parametrize("level", ["easy", "medium", "hard"])
|
| 262 |
def test_stall_loses(level):
|
| 263 |
-
"""Stall must be a real
|
| 264 |
-
|
| 265 |
-
|
| 266 |
-
|
| 267 |
pytest.importorskip("openra_train")
|
| 268 |
from openra_bench.eval_core import run_level
|
| 269 |
|
| 270 |
c = compile_level(load_pack(PACK_PATH), level)
|
| 271 |
-
|
| 272 |
-
|
| 273 |
-
|
| 274 |
-
|
| 275 |
-
|
| 276 |
-
|
| 10 |
rocket squad's short stand-off + low HP gets out-DPSed by the rifle
|
| 11 |
mass); matching with own rifles is a 1:1 attrition that loses.
|
| 12 |
|
| 13 |
+
The win predicate is `unit_type_count_gte 2tnk:3 AND units_killed_gte
|
| 14 |
+
K AND has_building fact` — the 2tnk:3 clause is the load-bearing
|
| 15 |
+
anti-cheat: only a policy that ACTUALLY BUILDS the 3-tank fist can
|
| 16 |
+
clear the bar. (The armour-class engine fix on OpenRA-Rust main made
|
| 17 |
+
pre-placed agent combat units auto-fire effectively, so the starter
|
| 18 |
+
jeep is `stance: 0` HoldFire — it scouts, it cannot rack up kills on
|
| 19 |
+
its own.)
|
| 20 |
+
|
| 21 |
The bar (per the spec):
|
| 22 |
+
• stall (only observe) → LOSS (no 2tnk, no kills; the
|
| 23 |
+
idle HoldFire jeep is hunted down → force-wipe / after_ticks)
|
| 24 |
+
• build-only-e1 (match 1:1) → LOSS (never builds 2tnk → the
|
| 25 |
+
2tnk:3 clause is structurally unmet)
|
| 26 |
+
• build-only-e3 (wrong counter) → LOSS (never builds 2tnk → the
|
| 27 |
+
2tnk:3 clause is structurally unmet)
|
| 28 |
+
• intended build-2tnk → WIN (3 medium tanks walk
|
| 29 |
+
through the e1 mass; 2tnk:3 + kill bar both latch)
|
| 30 |
+
|
| 31 |
+
Validation is scripted (no model / network) — every policy is
|
| 32 |
+
exercised against the live engine on every level and every hard
|
| 33 |
+
seed 1..4.
|
| 34 |
"""
|
| 35 |
from __future__ import annotations
|
| 36 |
|
|
|
|
| 78 |
)
|
| 79 |
|
| 80 |
|
| 81 |
+
def _ctx(*, tanks=0, tick=1000, kills=0, lost=0, has_fact=True, units=None):
|
| 82 |
+
"""Synthesize a WinContext for predicate-level checks.
|
| 83 |
+
|
| 84 |
+
`tanks` synthesizes that many 2tnk units in `units_summary`;
|
| 85 |
+
pass `units` explicitly to model a different composition (or an
|
| 86 |
+
empty force).
|
| 87 |
+
"""
|
| 88 |
import types
|
| 89 |
|
| 90 |
sig = types.SimpleNamespace(
|
|
|
|
| 94 |
cash=0,
|
| 95 |
resources=0,
|
| 96 |
own_buildings=[],
|
| 97 |
+
own_building_types=(
|
| 98 |
+
{"fact", "powr", "tent", "weap", "fix"}
|
| 99 |
+
if has_fact
|
| 100 |
+
else {"powr", "tent", "weap", "fix"}
|
| 101 |
+
),
|
| 102 |
enemies_seen_ids=set(),
|
| 103 |
enemy_buildings_seen_ids=set(),
|
| 104 |
)
|
| 105 |
+
if units is None:
|
| 106 |
+
units = [
|
| 107 |
+
{"cell_x": 30, "cell_y": 20, "type": "2tnk", "id": str(1000 + i)}
|
| 108 |
+
for i in range(tanks)
|
| 109 |
+
]
|
| 110 |
return WinContext(
|
| 111 |
signals=sig,
|
| 112 |
render_state={"units_summary": list(units)},
|
| 113 |
)
|
| 114 |
|
| 115 |
|
| 116 |
def test_easy_predicates():
|
| 117 |
c = compile_level(load_pack(PACK_PATH), "easy")
|
| 118 |
+
# Intended: 3 tanks fielded, 6 kills, fact still up, in time → WIN
|
| 119 |
+
assert evaluate(c.win_condition, _ctx(tanks=3, tick=2000, kills=6))
|
| 120 |
+
# Only 2 tanks fielded → 2tnk:3 clause unmet → not a win
|
| 121 |
+
assert not evaluate(c.win_condition, _ctx(tanks=2, tick=2000, kills=6))
|
| 122 |
# Kill bar unmet (only 5 kills) → not a win
|
| 123 |
+
assert not evaluate(c.win_condition, _ctx(tanks=3, tick=2000, kills=5))
|
| 124 |
+
# Wrong counter: 8 e3 fielded, kill bar met, but 0 tanks → not a win
|
| 125 |
+
e3s = [
|
| 126 |
+
{"cell_x": 30, "cell_y": 20, "type": "e3", "id": str(2000 + i)}
|
| 127 |
+
for i in range(8)
|
| 128 |
+
]
|
| 129 |
+
assert not evaluate(c.win_condition, _ctx(units=e3s, tick=2000, kills=6))
|
| 130 |
# Force wipe (all units dead) → fail via not own_units_gte:1
|
| 131 |
assert evaluate(c.fail_condition, _ctx(units=[], tick=2000, kills=6, lost=4))
|
| 132 |
# Fact destroyed → fail via not has_building:fact
|
| 133 |
assert evaluate(
|
| 134 |
c.fail_condition,
|
| 135 |
+
_ctx(tanks=3, tick=2000, kills=6, has_fact=False),
|
| 136 |
)
|
| 137 |
# Timeout with bar unmet → fail (after_ticks 5401 reachable)
|
| 138 |
+
assert evaluate(c.fail_condition, _ctx(tanks=3, tick=5402, kills=5))
|
| 139 |
|
| 140 |
|
| 141 |
def test_medium_predicates():
|
| 142 |
c = compile_level(load_pack(PACK_PATH), "medium")
|
| 143 |
+
# Intended: 3 tanks, 8 kills, fact still up → WIN
|
| 144 |
+
assert evaluate(c.win_condition, _ctx(tanks=3, tick=2000, kills=8))
|
| 145 |
+
# Only 2 tanks → 2tnk:3 clause unmet → not a win
|
| 146 |
+
assert not evaluate(c.win_condition, _ctx(tanks=2, tick=2000, kills=8))
|
| 147 |
# Bar unmet (only 7 kills) → not a win
|
| 148 |
+
assert not evaluate(c.win_condition, _ctx(tanks=3, tick=2000, kills=7))
|
| 149 |
# Force wipe → fail
|
| 150 |
assert evaluate(c.fail_condition, _ctx(units=[], tick=2000, kills=8, lost=4))
|
| 151 |
# Fact destroyed → fail
|
| 152 |
assert evaluate(
|
| 153 |
c.fail_condition,
|
| 154 |
+
_ctx(tanks=3, tick=2000, kills=8, has_fact=False),
|
| 155 |
)
|
| 156 |
# Timeout → fail
|
| 157 |
+
assert evaluate(c.fail_condition, _ctx(tanks=3, tick=5402, kills=7))
|
| 158 |
|
| 159 |
|
| 160 |
def test_hard_predicates():
|
| 161 |
c = compile_level(load_pack(PACK_PATH), "hard")
|
| 162 |
+
# Intended: 3 tanks, 8 kills, fact up → WIN
|
| 163 |
+
assert evaluate(c.win_condition, _ctx(tanks=3, tick=2000, kills=8))
|
| 164 |
+
# Only 2 tanks → 2tnk:3 clause unmet → not a win
|
| 165 |
+
assert not evaluate(c.win_condition, _ctx(tanks=2, tick=2000, kills=8))
|
| 166 |
# Bar unmet → not a win
|
| 167 |
+
assert not evaluate(c.win_condition, _ctx(tanks=3, tick=2000, kills=7))
|
| 168 |
# Force wipe → fail
|
| 169 |
assert evaluate(c.fail_condition, _ctx(units=[], tick=2000, kills=8, lost=4))
|
| 170 |
# Fact destroyed → fail
|
| 171 |
assert evaluate(
|
| 172 |
c.fail_condition,
|
| 173 |
+
_ctx(tanks=3, tick=2000, kills=8, has_fact=False),
|
| 174 |
)
|
| 175 |
# Timeout → fail
|
| 176 |
+
assert evaluate(c.fail_condition, _ctx(tanks=3, tick=5402, kills=7))
|
| 177 |
+
|
| 178 |
+
|
| 179 |
+
def test_win_requires_three_medium_tanks():
|
| 180 |
+
"""The load-bearing anti-cheat: every level's win predicate must
|
| 181 |
+
require `unit_type_count_gte 2tnk:3` — a stall / wrong-counter
|
| 182 |
+
policy that never builds the medium-tank fist can never win
|
| 183 |
+
regardless of how many kills the entrenched enemy concedes."""
|
| 184 |
+
pack = load_pack(PACK_PATH)
|
| 185 |
+
for lvl in ("easy", "medium", "hard"):
|
| 186 |
+
c = compile_level(pack, lvl)
|
| 187 |
+
# 0 tanks, kill bar trivially exceeded, fact up, in time → NOT a win.
|
| 188 |
+
assert not evaluate(c.win_condition, _ctx(tanks=0, tick=2000, kills=99)), (
|
| 189 |
+
f"{lvl}: win must require 3 fielded 2tnk — a 0-tank policy "
|
| 190 |
+
f"with the kill bar met must NOT win (anti-cheat clause)"
|
| 191 |
+
)
|
| 192 |
+
# 3 tanks + kill bar met + fact up + in time → WIN.
|
| 193 |
+
assert evaluate(c.win_condition, _ctx(tanks=3, tick=2000, kills=99)), (
|
| 194 |
+
f"{lvl}: 3 fielded 2tnk + kill bar met must WIN"
|
| 195 |
+
)
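The anti-cheat property asserted above amounts to the win predicate being a conjunction in which the tank count is an independent clause. The following is a minimal sketch of that shape, not the pack's actual predicate engine: `Ctx`, `unit_type_count_gte`, `wins`, and the `kill_bar` / `tick_limit` values are all hypothetical names and numbers chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Ctx:
    """Hypothetical evaluation context (stand-in for whatever _ctx() builds)."""
    units: list = field(default_factory=list)
    kills: int = 0
    tick: int = 0
    has_fact: bool = True


def unit_type_count_gte(ctx: Ctx, unit_type: str, n: int) -> bool:
    # Count own units of exactly this type and compare against the threshold.
    return sum(1 for u in ctx.units if u["type"] == unit_type) >= n


def wins(ctx: Ctx, *, kill_bar: int, tick_limit: int) -> bool:
    # Every clause must hold; a huge kill count cannot compensate for 0 tanks.
    return (
        ctx.kills >= kill_bar
        and unit_type_count_gte(ctx, "2tnk", 3)
        and ctx.has_fact
        and ctx.tick <= tick_limit
    )


tanks = [{"type": "2tnk", "id": str(i)} for i in range(3)]
rockets = [{"type": "e3", "id": str(i)} for i in range(8)]

# Illustrative thresholds only (assumed, not read from the pack).
stall_wins = wins(Ctx(units=[], kills=99, tick=2000), kill_bar=6, tick_limit=5400)
wrong_counter_wins = wins(Ctx(units=rockets, kills=6, tick=2000), kill_bar=6, tick_limit=5400)
intended_wins = wins(Ctx(units=tanks, kills=6, tick=2000), kill_bar=6, tick_limit=5400)
```

Because the clauses are joined with `and`, the stall and wrong-counter contexts fail on the `2tnk` clause alone, which is what makes the bar structural rather than dependent on combat outcomes.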
|
| 196 |
|
| 197 |
|
| 198 |
def test_timeout_reachable_inside_max_turns():
|
|
|
|
| 247 |
assert n_e1 >= 6, f"{lvl}: needs ≥6 e1 in the enemy cluster; got {n_e1}"
|
| 248 |
|
| 249 |
|
| 250 |
+
def test_starter_jeep_is_hold_fire():
|
| 251 |
+
"""The armor-class engine fix made pre-placed agent combat units
|
| 252 |
+
auto-fire effectively. The starter jeep must be `stance: 0`
|
| 253 |
+
(HoldFire) on every spawn group of every level so a pure-observe
|
| 254 |
+
stall policy cannot rack up kills with it for free."""
|
| 255 |
+
pack = load_pack(PACK_PATH)
|
| 256 |
+
for lvl in ("easy", "medium", "hard"):
|
| 257 |
+
c = compile_level(pack, lvl)
|
| 258 |
+
jeeps = [
|
| 259 |
+
a for a in c.scenario.actors
|
| 260 |
+
if a.owner == "agent" and a.type == "jeep"
|
| 261 |
+
]
|
| 262 |
+
assert jeeps, f"{lvl}: needs a starter jeep"
|
| 263 |
+
for j in jeeps:
|
| 264 |
+
assert j.stance == 0, (
|
| 265 |
+
f"{lvl}: starter jeep must be stance:0 (HoldFire) so a "
|
| 266 |
+
f"stall policy cannot score free kills; got stance={j.stance}"
|
| 267 |
+
)
|
| 268 |
+
|
| 269 |
+
|
| 270 |
+
def test_agent_base_can_build_both_counters():
|
| 271 |
"""The decision under test is COMPOSITION, not tech-up. Each
|
| 272 |
+
spawn group on every level must have the buildings that make BOTH
|
| 273 |
+
counters producible from turn 1: tent (e1/e3), weap+powr+fix
|
| 274 |
+
(2tnk — the war-factory vehicle queue needs power online AND a
|
| 275 |
+
service depot for the medium tank to clear its prerequisites).
|
| 276 |
+
The starter jeep must also be present."""
|
| 277 |
pack = load_pack(PACK_PATH)
|
| 278 |
for lvl in ("easy", "medium", "hard"):
|
| 279 |
c = compile_level(pack, lvl)
|
| 280 |
+
# Per spawn group, the agent must have the full base + jeep.
|
| 281 |
# On non-hard levels there is exactly one (default) spawn
|
| 282 |
# group (spawn_point None → 0); on hard there are two.
|
| 283 |
groups: dict[int, list] = {}
|
|
|
|
| 288 |
groups.setdefault(g, []).append(a.type)
|
| 289 |
assert groups, f"{lvl}: no agent actors found"
|
| 290 |
for g, ts in groups.items():
|
| 291 |
+
for need in ("fact", "powr", "tent", "weap", "fix", "jeep"):
|
| 292 |
assert need in ts, (
|
| 293 |
f"{lvl}: spawn group {g} missing {need}; got {ts}"
|
| 294 |
)
|
|
|
|
| 313 |
assert 25 * 100 == 2500
|
| 314 |
|
| 315 |
|
| 316 |
+
# ── engine-driven scripted policies ─────────────────────────────────
|
| 317 |
#
|
| 318 |
+
# The full RPS-counter bar (stall LOSES / build-e3 LOSES / build-e1
|
| 319 |
+
# LOSES / build-2tnk WINS) is exercised against the live engine on
|
| 320 |
+
# every level, with seeds 1..4 on hard.
|
| 321 |
+
|
| 322 |
+
|
| 323 |
+
def _own_units(rs, *, type_filter=None):
|
| 324 |
+
out = []
|
| 325 |
+
for u in rs.get("units_summary") or []:
|
| 326 |
+
if type_filter and (u.get("type") or "").lower() not in type_filter:
|
| 327 |
+
continue
|
| 328 |
+
out.append(u)
|
| 329 |
+
return out
|
| 330 |
+
|
| 331 |
+
|
| 332 |
+
def _enemy_infantry(rs):
|
| 333 |
+
return [
|
| 334 |
+
e for e in (rs.get("enemy_summary") or [])
|
| 335 |
+
if (e.get("type") or "").lower() == "e1" and not e.get("is_building")
|
| 336 |
+
]
|
| 337 |
|
| 338 |
|
| 339 |
def _stall(rs, Command):
|
| 340 |
+
"""Pure observe — no production. The HoldFire jeep never fires →
|
| 341 |
+
0 kills, 0 tanks; the 2tnk:3 win clause never latches → LOSS
|
| 342 |
+
(force-wipe when the e1 swarm hunts the jeep, or after_ticks)."""
|
| 343 |
return [Command.observe()]
|
| 344 |
|
| 345 |
|
| 346 |
+
def _make_build_policy(unit_type, cost):
|
| 347 |
+
"""Queue `unit_type` every turn the budget allows and send each
|
| 348 |
+
produced unit at the enemy infantry cluster."""
|
| 349 |
+
|
| 350 |
+
def policy(rs, Command):
|
| 351 |
+
cmds = []
|
| 352 |
+
if rs.get("cash", 0) >= cost:
|
| 353 |
+
cmds.append(Command.build(unit_type))
|
| 354 |
+
fighters = _own_units(rs, type_filter={unit_type})
|
| 355 |
+
targets = _enemy_infantry(rs)
|
| 356 |
+
for u in fighters:
|
| 357 |
+
if targets:
|
| 358 |
+
cmds.append(
|
| 359 |
+
Command.attack_unit([str(u["id"])], str(targets[0]["id"]))
|
| 360 |
+
)
|
| 361 |
+
else:
|
| 362 |
+
cmds.append(Command.attack_move([str(u["id"])], 70, 20))
|
| 363 |
+
return cmds if cmds else [Command.observe()]
|
| 364 |
+
|
| 365 |
+
return policy
|
| 366 |
+
|
| 367 |
+
|
| 368 |
+
_build_e3 = _make_build_policy("e3", 300)
|
| 369 |
+
_build_e1 = _make_build_policy("e1", 100)
|
| 370 |
+
_build_2tnk = _make_build_policy("2tnk", 850)
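To see what such a policy emits for a single observation without the engine, it can be driven with a stub command factory. This is a sketch under stated assumptions: `FakeCommand` is a hypothetical stand-in that returns plain tuples (the real `Command` class is not shown in this diff), `make_build_policy` is a condensed restatement of the factory above, and the observation dict only uses the keys the helpers read (`cash`, `units_summary`, `enemy_summary`).

```python
class FakeCommand:
    """Hypothetical stub: each factory method returns a plain tuple."""

    @staticmethod
    def observe():
        return ("observe",)

    @staticmethod
    def build(unit_type):
        return ("build", unit_type)

    @staticmethod
    def attack_unit(ids, target_id):
        return ("attack_unit", tuple(ids), target_id)

    @staticmethod
    def attack_move(ids, x, y):
        return ("attack_move", tuple(ids), x, y)


def make_build_policy(unit_type, cost):
    """Condensed restatement of the build policy, for illustration only."""

    def policy(rs, Command):
        cmds = []
        if rs.get("cash", 0) >= cost:
            cmds.append(Command.build(unit_type))
        # Enemy e1 infantry that are not buildings.
        targets = [
            e for e in (rs.get("enemy_summary") or [])
            if (e.get("type") or "").lower() == "e1" and not e.get("is_building")
        ]
        for u in rs.get("units_summary") or []:
            if (u.get("type") or "").lower() != unit_type:
                continue  # only units of the built type are sent to fight
            if targets:
                cmds.append(Command.attack_unit([str(u["id"])], str(targets[0]["id"])))
            else:
                cmds.append(Command.attack_move([str(u["id"])], 70, 20))
        return cmds or [Command.observe()]

    return policy


rs = {
    "cash": 900,
    "units_summary": [{"id": 7, "type": "2tnk"}, {"id": 1, "type": "jeep"}],
    "enemy_summary": [{"id": 42, "type": "e1"}],
}
cmds = make_build_policy("2tnk", 850)(rs, FakeCommand)
idle = make_build_policy("2tnk", 850)({}, FakeCommand)
```

Here `cmds` holds one build order plus one attack order for the tank, and the jeep is left alone; with an empty observation the policy falls back to a single observe command.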
|
| 371 |
+
|
| 372 |
+
|
| 373 |
@pytest.mark.parametrize("level", ["easy", "medium", "hard"])
|
| 374 |
def test_stall_loses(level):
|
| 375 |
+
"""Stall must be a real LOSS on every level and every hard seed
|
| 376 |
+
(no draw): the win predicate requires `unit_type_count_gte
|
| 377 |
+
2tnk:3` which a pure-observe policy can never satisfy, and the
|
| 378 |
+
idle HoldFire jeep is hunted down → force-wipe / after_ticks."""
|
| 379 |
pytest.importorskip("openra_train")
|
| 380 |
from openra_bench.eval_core import run_level
|
| 381 |
|
| 382 |
c = compile_level(load_pack(PACK_PATH), level)
|
| 383 |
+
seeds = (1, 2, 3, 4) if level == "hard" else (1,)
|
| 384 |
+
for s in seeds:
|
| 385 |
+
r = run_level(c, _stall, seed=s)
|
| 386 |
+
assert r.outcome == "loss", (
|
| 387 |
+
f"{level} seed={s}: stall must be a real LOSS (no 2tnk → "
|
| 388 |
+
f"win clause unmet); got {r.outcome} after {r.turns} turns "
|
| 389 |
+
f"(kills={r.signals.units_killed}, lost={r.signals.units_lost})"
|
| 390 |
+
)
|
| 391 |
+
|
| 392 |
+
|
| 393 |
+
@pytest.mark.parametrize("level", ["easy", "medium", "hard"])
|
| 394 |
+
def test_build_e3_wrong_counter_loses(level):
|
| 395 |
+
"""Mass anti-tank rockets are the WRONG counter — and crucially
|
| 396 |
+
the policy never builds 2tnk, so the `unit_type_count_gte 2tnk:3`
|
| 397 |
+
win clause is structurally unmet → real LOSS on every level and
|
| 398 |
+
every hard seed."""
|
| 399 |
+
pytest.importorskip("openra_train")
|
| 400 |
+
from openra_bench.eval_core import run_level
|
| 401 |
+
|
| 402 |
+
c = compile_level(load_pack(PACK_PATH), level)
|
| 403 |
+
seeds = (1, 2, 3, 4) if level == "hard" else (1,)
|
| 404 |
+
for s in seeds:
|
| 405 |
+
r = run_level(c, _build_e3, seed=s)
|
| 406 |
+
assert r.outcome == "loss", (
|
| 407 |
+
f"{level} seed={s}: build-e3 wrong-counter must LOSE (no "
|
| 408 |
+
f"2tnk → win clause unmet); got {r.outcome} "
|
| 409 |
+
f"(kills={r.signals.units_killed}, lost={r.signals.units_lost})"
|
| 410 |
+
)
|
| 411 |
+
|
| 412 |
+
|
| 413 |
+
@pytest.mark.parametrize("level", ["easy", "medium", "hard"])
|
| 414 |
+
def test_build_e1_wrong_counter_loses(level):
|
| 415 |
+
"""Matching the enemy 1:1 with our own rifle infantry never builds 2tnk, so
|
| 416 |
+
the `unit_type_count_gte 2tnk:3` win clause is structurally unmet
|
| 417 |
+
→ real LOSS on every level and every hard seed."""
|
| 418 |
+
pytest.importorskip("openra_train")
|
| 419 |
+
from openra_bench.eval_core import run_level
|
| 420 |
+
|
| 421 |
+
c = compile_level(load_pack(PACK_PATH), level)
|
| 422 |
+
seeds = (1, 2, 3, 4) if level == "hard" else (1,)
|
| 423 |
+
for s in seeds:
|
| 424 |
+
r = run_level(c, _build_e1, seed=s)
|
| 425 |
+
assert r.outcome == "loss", (
|
| 426 |
+
f"{level} seed={s}: build-e1 wrong-counter must LOSE (no "
|
| 427 |
+
f"2tnk → win clause unmet); got {r.outcome} "
|
| 428 |
+
f"(kills={r.signals.units_killed}, lost={r.signals.units_lost})"
|
| 429 |
+
)
|
| 430 |
+
|
| 431 |
+
|
| 432 |
+
@pytest.mark.parametrize("level", ["easy", "medium", "hard"])
|
| 433 |
+
def test_intended_build_2tnk_wins(level):
|
| 434 |
+
"""The RPS counter pick: build 3× 2tnk (medium tanks) and engage.
|
| 435 |
+
Heavy armour walks through the e1 mass — the `2tnk:3` clause and
|
| 436 |
+
the kill bar both latch. Wins on every level and every hard
|
| 437 |
+
seed 1..4."""
|
| 438 |
+
pytest.importorskip("openra_train")
|
| 439 |
+
from openra_bench.eval_core import run_level
|
| 440 |
+
|
| 441 |
+
c = compile_level(load_pack(PACK_PATH), level)
|
| 442 |
+
seeds = (1, 2, 3, 4) if level == "hard" else (1,)
|
| 443 |
+
for s in seeds:
|
| 444 |
+
r = run_level(c, _build_2tnk, seed=s)
|
| 445 |
+
assert r.outcome == "win", (
|
| 446 |
+
f"{level} seed={s}: intended build-2tnk must WIN; got "
|
| 447 |
+
f"{r.outcome} (kills={r.signals.units_killed}, "
|
| 448 |
+
f"lost={r.signals.units_lost})"
|
| 449 |
+
)
|