yxc20098 committed on
Commit b77f5b2 · 1 Parent(s): 0fb13a4

fix(scenario): combat-retreat-after-engagement — recalibrate after engine movement fixes

The OpenRA-Rust movement fixes (moving units fire and take fire en
route; attack_unit on out-of-sight targets paths normally) regressed
this pack on every tier:
- the close-range trade became far more lethal: meeting the kill
quota required losing more tanks than the survival cap allowed, so
the intended engage-then-retreat could not win (medium lost 2 tanks
for 4 kills; the fight was unsolvable inside the loss cap);
- in interrupt mode an action-heavy episode advances FEWER ticks per
turn than a pure stall, so the old after_ticks:4501 deadline was
inert for the intended policy — a non-winning run DREW (tick ~4428
at max_turns) instead of LOSING (draw degeneracy).
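
The draw degeneracy in the second point comes down to comparing the tick an episode has reached at max_turns against the deadline. A minimal sketch of that comparison, using only figures quoted in this commit (the 93 + 90 per turn stall cadence from the old pack comment, and the approximate tick 4428 for the action-heavy run). The function names and the "fires once tick >= after_ticks" semantics are assumptions for illustration, not part of the bench:

```python
import math

FIRST_TICK = 93   # tick after the first turn, per the old pack comment
PER_TURN = 90     # approximate ticks per stalled turn, per the old pack comment


def stall_final_tick(max_turns: int) -> int:
    """Tick a pure stall reaches at max_turns: 93 + 90 * (max_turns - 1)."""
    return FIRST_TICK + PER_TURN * (max_turns - 1)


def first_turn_crossing(deadline: int) -> int:
    """Smallest turn at which a pure stall has reached `deadline` ticks."""
    return 1 + math.ceil((deadline - FIRST_TICK) / PER_TURN)


def outcome_at_timeout(final_tick: int, after_ticks: int) -> str:
    """Assumed semantics: the fail clause fires once tick >= after_ticks."""
    return "LOSS" if final_tick >= after_ticks else "DRAW"


MAX_TURNS = 51
ACTION_FINAL_TICK = 4428  # approximate figure quoted in the commit message

for deadline in (4501, 4000):
    print(
        deadline,
        "stall:", outcome_at_timeout(stall_final_tick(MAX_TURNS), deadline),
        "action-heavy:", outcome_at_timeout(ACTION_FINAL_TICK, deadline),
    )
```

Under these assumptions the old 4501 deadline is reached by a stall (tick 4593) but not by the action-heavy run, whereas 4000 is reached by both.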

Recalibration:
- lighter enemy: easy 2× e3 + 1× 3tnk (kill bar 2), medium 3× e3 + 1×
3tnk (kill bar 3), hard 2× e3 + 1× 3tnk (kill bar 2). The e3 are set
to stance:2 Defend so they hold the firing line and do not chase the
retreating column (a stance:3 e3 hunts the tanks home and confounds
the retreat).
- deadline pulled down to after_ticks/within_ticks 4000 — crossed by
every policy inside max_turns=51 (a stall crosses ~turn 45, an
action policy before turn 51), so a non-winning run is a real LOSS,
never a draw.
- the intended test policy was rewritten: the old kill-count
inference (peak_visible - visible) misread enemies leaving vision
as kills; the new policy is a clean three-phase approach / engage /
retreat driven by an HP-floor + tank-lost disengage trigger.
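
The policy rewrite in the last point can be sketched roughly as follows. The first function reproduces the old peak_visible - visible inference to show why an enemy that merely leaves vision is counted as a kill; the rest is a three-phase selector driven by an HP-floor plus tank-lost trigger. The `Tank` type, the `hp_frac` field, the 0.5 floor, and the function names are hypothetical and are not taken from the test file:

```python
from dataclasses import dataclass


@dataclass
class Tank:
    hp_frac: float  # assumed: current HP as a fraction of max HP


def inferred_kills(peak_visible: int, visible: int) -> int:
    """Old heuristic: any drop in visible enemies is treated as kills."""
    return peak_visible - visible


def should_disengage(tanks: list[Tank], initial_count: int,
                     hp_floor: float = 0.5) -> bool:
    """New trigger: a tank has been lost OR a survivor is below the HP floor."""
    tank_lost = len(tanks) < initial_count
    below_floor = any(t.hp_frac < hp_floor for t in tanks)
    return tank_lost or below_floor


def phase(in_range: bool, retreat_latched: bool) -> str:
    """Three phases; once retreat is latched the policy never re-engages."""
    if retreat_latched:
        return "retreat"
    return "engage" if in_range else "approach"


# Three enemies were seen, two walk out of vision, none has died:
print(inferred_kills(peak_visible=3, visible=1))   # the heuristic reports 2

# Four healthy tanks: keep fighting. One damaged below the floor: bail.
print(should_disengage([Tank(1.0)] * 4, initial_count=4))
print(should_disengage([Tank(1.0), Tank(0.4), Tank(0.9), Tank(0.8)], initial_count=4))
```

The point of keying the trigger on the agent's own force state is that it does not depend on what is currently visible, so vision loss cannot fake progress toward the kill bar.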

Bar verified on every level × seeds 1-4: stall / brute-attack-until-
death / never-engage all LOSE (real timeout LOSS, no draws); intended
engage-then-retreat WINS.
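
To make the recalibrated bar concrete, here is a toy evaluator applied to the easy-tier win and fail conditions from this commit. It is a stand-in, not the bench's own `evaluate`: the context shape, the Euclidean radius test, and the inclusive tick comparisons are all assumptions (under them tick 4000 exactly satisfies both within_ticks and after_ticks; the real engine's tie-breaking is not shown in this diff).

```python
import math


def toy_evaluate(cond: dict, ctx: dict) -> bool:
    """Evaluate a single-key nested condition dict against a context dict."""
    (key, arg), = cond.items()
    if key == "all_of":
        return all(toy_evaluate(c, ctx) for c in arg)
    if key == "any_of":
        return any(toy_evaluate(c, ctx) for c in arg)
    if key == "not":
        return not toy_evaluate(arg, ctx)
    if key == "units_killed_gte":
        return ctx["killed"] >= arg
    if key == "own_units_gte":
        return len(ctx["units"]) >= arg
    if key == "within_ticks":
        return ctx["tick"] <= arg          # assumed inclusive
    if key == "after_ticks":
        return ctx["tick"] >= arg          # assumed inclusive
    if key == "units_in_region_gte":
        inside = sum(
            math.hypot(x - arg["x"], y - arg["y"]) <= arg["radius"]
            for x, y in ctx["units"]
        )
        return inside >= arg["n"]
    raise ValueError(f"unknown predicate: {key}")


EASY_WIN = {"all_of": [
    {"units_killed_gte": 2},
    {"units_in_region_gte": {"x": 5, "y": 20, "radius": 6, "n": 3}},
    {"own_units_gte": 2},
    {"within_ticks": 4000},
]}
EASY_FAIL = {"any_of": [
    {"after_ticks": 4000},
    {"not": {"own_units_gte": 2}},
]}

home3 = [(5, 18), (5, 20), (5, 21)]  # three surviving tanks back in the safe zone
print(toy_evaluate(EASY_WIN, {"units": home3, "tick": 3000, "killed": 2}))
print(toy_evaluate(EASY_FAIL, {"units": home3, "tick": 4002, "killed": 0}))
```

The first call corresponds to the intended engage-then-retreat outcome and the second to a stall that runs past the deadline.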

openra_bench/scenarios/packs/combat-retreat-after-engagement.yaml CHANGED
@@ -11,62 +11,68 @@
11
  # trades hits for intel/attrition, then breaks contact before the
12
  # enemy can encircle and destroy.
13
  #
 
 
 
 
 
 
 
 
 
 
14
  # Idiom (the four-policy bar — same shape used by combat-kite and
15
  # combat-focus-fire):
16
- # • stall (only observe) → LOSS. The kill bar
17
- # (units_killed_gte:3) is never met → after_ticks LOSS.
18
- # • brute attack-until-death (commit fully) → LOSS. The heavy
19
- # 3tnk + 5× e3 rocket-infantry mass out-trades 2tnk in a
20
- # stand-up fight at the enemy line; tanks die two-by-two and
21
- # `own_units_gte:3` flips to fail before the survivors can pull
22
- # back.
23
- # • never-engage (sit at safe zone) → LOSS. Three tanks are
24
- # in the safe-zone region the whole episode, but the kill bar is
25
- # unmet → after_ticks LOSS.
26
  # • intended engage-then-retreat → WIN. March east into
27
- # fire range, attack_unit the soft e3 rocket infantry (the kill
28
- # bar is units_killed_gte:3, and e3 are the cheap kills), then
29
- # the moment the first tank is gone (or the second is hurt) issue
30
- # a move_units order WEST back to the safe-zone region (x≈5,
31
- # y≈20, r=6). All four tanks need not survive: the bar is
32
- # own_units_gte:3 (lose at most ONE tank) AND ≥3 tanks back in
33
  # the safe-zone radius. The decision under test is DISENGAGE
34
  # TIMING: bail BEFORE attrition busts the force-preservation cap.
35
  #
36
- # Why "battle is unwinnable" (verified vs engine combat sheet):
37
- # • 2tnk vs 5× e3 + 3tnk: head-on, the two 3tnks alone
38
- # (Soviet heavy cannon, ~range 5, high anti-armour DPS) trade
39
- # 2tnk-for-3tnk roughly 1:1 and the rocket infantry (e3 Dragon
40
- # launcher, anti-armour) finishes the surviving 2tnks. A brute
41
- # attack_move column dies before clearing the squad: the loss
42
- # cap (lose at most 1) flips on tank #2 going down, while only
43
- # ~2 e3 have died.
44
- # • The "engage 3 e3 → break contact" play kills 3 cheap targets
45
- # fast (4-vs-1 focus on each e3 ends it in 1-2 decision turns)
46
- # then bails BEFORE the 3tnks have ground enough HP off the
47
- # column to crater the survival cap.
48
  #
49
  # Engine guardrails:
50
  # • Map: rush-hour-arena (128 × 40, playable x in [2..126],
51
  # y in [2..38]). Tanks stage at the west edge (x=5..7), enemies
52
- # at (60,20). Cells verified placeable per CLAUDE.md (60,28 was
53
- # called out as silently-failing — we use (60,20) and adjacent
54
- # y=17..23).
55
  # • Enemy bot is `guard` (not `hunt`): guard holds post and lunges
56
- # within GUARD_AGGRO~16, snaps back past leash 18. This lets the
57
- # enemy COMMIT during the engagement but NOT pursue the retreating
58
- # tanks all the way home: the retreat is genuinely safe once the
59
- # tanks break leash range. Critically, `guard` also means the
60
- # never-engage policy is not punished by enemy approach (the
61
- # enemies stay east), which is exactly what makes never-engage a
62
- # CLEAN kill-bar LOSS rather than a confounded contact loss.
63
  # • Persistent unarmed `fact` far east (x=120, y=20) prevents engine
64
  # auto-done on enemy-elimination collapsing the run to DRAW — if
65
  # the agent accidentally kills everything, the run still evaluates
66
  # the in-region predicate at the safe zone (and the agent likely
67
  # hasn't retreated → fail clause fires on the clock or attrition).
68
- # • after_ticks 4501 ≤ 93 + 90·(51-1) = 4593 → the stall LOSS is
69
- # reachable inside max_turns; no draw degeneracy.
 
 
 
70
  # • units_lost / units_killed are signal-level integers (combat-
71
  # focus-fire idiom); units_in_region_gte is the geometry predicate
72
  # established by action-multiunit-coordination / artofwar-lure-
@@ -123,27 +129,34 @@ base:
123
 
124
  levels:
125
  # ── EASY ─────────────────────────────────────────────────────────
126
- # Bare retreat skill: 4 tanks vs LESS enemy (4× e3 + 1× 3tnk).
127
- # Loss cap forgiving (own_units_gte:2 ⇒ lose up to 2 tanks), kill
128
- # bar only 2. The engage-then-retreat play is comfortable; the
129
- # brute may LOSE on the heavy + rocket attrition trade; stall and
130
- # never-engage LOSE on the kill bar.
 
 
 
 
 
 
 
 
 
131
  easy:
132
  description: >
133
  Four medium tanks (2tnk) stage at the safe zone in the west
134
- (x=5, y=18..21). A smaller enemy squad of FOUR rocket
135
- infantry (e3, anti-tank Dragon launcher) on the firing line
136
- at x=60, escorting ONE heavy tank (3tnk) set back at x=64,
137
- holds the centre. The fight is unwinnable head-on the
138
- rocket infantry kill armour fast and the 3tnk cannon
139
- out-trades the column once it overshoots into close range.
140
- The intended play: march east into engagement range, kill TWO
141
- enemy units (the cheap e3s die in 1-2 turns of focused tank
142
- fire), then move your tanks WEST BACK to the safe zone before
143
- attrition busts the loss cap. Win when ≥2 enemies are killed
144
- AND ≥3 of your tanks are within radius 6 of (5,20) AND you
145
- have ≥2 tanks alive, before tick 4500. Stall, brute attack
146
- until-death, and never-engage all LOSE.
147
  overrides:
148
  actors:
149
  # Strike force — 4 medium tanks at the safe-zone start.
@@ -151,21 +164,18 @@ levels:
151
  - {type: 2tnk, owner: agent, position: [5, 19], stance: 1}
152
  - {type: 2tnk, owner: agent, position: [5, 20], stance: 1}
153
  - {type: 2tnk, owner: agent, position: [5, 21], stance: 1}
154
- # Enemy squad — 4× e3 anti-tank rockets ON the firing line
155
- # at x=60 (the kill bar's soft targets), with the 3tnk
156
- # heavy escort set 4 cells BACK at x=64. The set-back heavy
157
- # only weighs in if the agent commits PAST the e3 line —
158
- # i.e. the brute attack_move policy that overshoots, so it
159
- # provides the attrition pressure that makes the brute LOSE
160
- # without immediately crushing the intended engage-the-line
161
- # play. (Engine balance fixes made a heavy ON the line too
162
- # lethal for the bare-skill easy tier; setting it back
163
- # mirrors the medium/hard geometry.)
164
- - {type: e3, owner: enemy, position: [60, 18], stance: 3}
165
- - {type: e3, owner: enemy, position: [60, 19], stance: 3}
166
- - {type: e3, owner: enemy, position: [60, 21], stance: 3}
167
- - {type: e3, owner: enemy, position: [60, 22], stance: 3}
168
- - {type: 3tnk, owner: enemy, position: [64, 20], stance: 3}
169
  # Persistent far-east enemy fact — prevents engine auto-done
170
  # on enemy-elimination so the safe-zone predicate is
171
  # evaluated rather than the run collapsing to DRAW.
@@ -175,74 +185,66 @@ levels:
175
  - {units_killed_gte: 2}
176
  - {units_in_region_gte: {x: 5, y: 20, radius: 6, n: 3}}
177
  - {own_units_gte: 2}
178
- - {within_ticks: 4500}
179
  fail_condition:
180
  any_of:
181
- - {after_ticks: 4501}
182
  - {not: {own_units_gte: 2}}
183
  max_turns: 51
184
 
185
  # ── MEDIUM ───────────────────────────────────────────────────────
186
- # +1 controlled variable vs easy: FULL enemy squad (5× e3 + 2× 3tnk)
187
- # AND tighter survival bar (own_units_gte:3 ⇒ lose AT MOST ONE
188
- # tank). The kill bar tightens to 3. The engage window is shorter:
189
- # the second 3tnk doubles the close-range damage and the brute now
190
- # loses tanks ~twice as fast. The intended engage-then-retreat
191
- # still wins: focus-fire 3 e3s (1-2 turns each) and pull back
192
- # before the heavies grind ≥2 tanks down.
 
193
  medium:
194
  description: >
195
  Four medium tanks (2tnk) stage at the safe zone in the west
196
- (x=5, y=18..21). The enemy squad at (60, 20) is FIVE rocket
197
- infantry (e3, anti-tank Dragon launcher) escorting TWO heavy
198
- tanks (3tnk). The fight is unwinnable head-on: the rockets
199
- and heavy cannons together collapse the column at close
200
- range. The intended play: march east into engagement range,
201
- focus-fire THREE enemy units (the cheap e3s die in 1-2 turns
202
- of concentrated tank fire each), then move your tanks WEST
203
- BACK to the safe zone before attrition takes a second tank.
204
- Win when ≥3 enemies are killed AND ≥3 of your tanks are
205
- within radius 6 of (5,20) AND you have ≥3 tanks alive,
206
- before tick 4500. Stall, brute attack-until-death, and
207
- never-engage all LOSE.
208
  overrides:
209
  actors:
210
  - {type: 2tnk, owner: agent, position: [5, 18], stance: 1}
211
  - {type: 2tnk, owner: agent, position: [5, 19], stance: 1}
212
  - {type: 2tnk, owner: agent, position: [5, 20], stance: 1}
213
  - {type: 2tnk, owner: agent, position: [5, 21], stance: 1}
214
- # FULL enemy squad — 5× e3 (kill-bar fodder; anti-tank
215
- # Dragon launcher) ON THE LINE at x=60, with the 3tnk
216
- # heavy escort placed 4 cells back at x=64. The e3 line is
217
- # what the attacker must close on to score kills (Dragon
218
- # range 5); from the e3 firing line (~x=55) the rear-rank
219
- # 3tnks (range ~5) cannot yet engage, so they only weigh in
220
- # if the agent commits PAST the e3 line, i.e. the brute
221
- # attack_move policy that overshoots. Spread the e3 line
222
- # across y=17..23 so all four tanks face fire.
223
- - {type: e3, owner: enemy, position: [60, 17], stance: 3}
224
- - {type: e3, owner: enemy, position: [60, 18], stance: 3}
225
- - {type: e3, owner: enemy, position: [60, 19], stance: 3}
226
- - {type: e3, owner: enemy, position: [60, 21], stance: 3}
227
- - {type: e3, owner: enemy, position: [60, 22], stance: 3}
228
- # Heavy tanks placed 4 cells BEHIND the rocket line —
229
- # within engagement leash for the guard bot to pursue when
230
- # contact starts (GUARD_AGGRO ~16), but outside their own
231
- # ~5-cell weapon range from the e3 firing line. They are
232
- # the attrition trigger for the brute attack_move policy
233
- # that closes past x=60 into 3tnk cannon range.
234
- - {type: 3tnk, owner: enemy, position: [64, 19], stance: 3}
235
- - {type: 3tnk, owner: enemy, position: [64, 21], stance: 3}
236
  - {type: fact, owner: enemy, position: [120, 20]}
237
  win_condition:
238
  all_of:
239
  - {units_killed_gte: 3}
240
  - {units_in_region_gte: {x: 5, y: 20, radius: 6, n: 3}}
241
  - {own_units_gte: 3}
242
- - {within_ticks: 4500}
243
  fail_condition:
244
  any_of:
245
- - {after_ticks: 4501}
246
  - {not: {own_units_gte: 3}}
247
  max_turns: 51
248
 
@@ -257,6 +259,15 @@ levels:
257
  # is symmetric across y=20 mid-latitude so both spawns face the
258
  # same engagement geometry.
259
  #
 
 
 
 
 
 
 
 
 
260
  # Per the CLAUDE.md `spawn_point` contract: ALL agent actors
261
  # carry an explicit spawn_point (the filter applies only to AGENT
262
  # actors); the enemy actors are unchanged and always place.
@@ -264,17 +275,17 @@ levels:
264
  description: >
265
  Four medium tanks (2tnk) stage at ONE of two safe-zone
266
  corridors (NORTH at x=5, y=8..11 OR SOUTH at x=5, y=28..31,
267
- chosen by seed — anti-memorisation). The enemy squad of FIVE
268
- rocket infantry (e3) escorting TWO heavy tanks (3tnk) holds
269
- the centre at (60, 20). The fight is unwinnable head-on.
 
270
  The intended play: march east-and-toward-centre into
271
- engagement range, focus-fire THREE enemy units (the cheap
272
- e3s die fast under concentrated tank fire), then move your
273
- tanks BACK to YOUR safe zone (the one you started in — read
274
- your start cell from obs) before attrition takes a second
275
- tank. Win when ≥3 enemies are killed AND ≥3 of your tanks
276
- are within radius 6 of YOUR safe zone (north (5,10) OR south
277
- (5,30)) AND you have ≥3 tanks alive, before tick 4500.
278
  Stall, brute attack-until-death, never-engage, and retreating
279
  to the WRONG safe zone all LOSE.
280
  overrides:
@@ -289,29 +300,26 @@ levels:
289
  - {type: 2tnk, owner: agent, position: [5, 29], stance: 1, spawn_point: 1}
290
  - {type: 2tnk, owner: agent, position: [5, 30], stance: 1, spawn_point: 1}
291
  - {type: 2tnk, owner: agent, position: [5, 31], stance: 1, spawn_point: 1}
292
- # FULL enemy squad — symmetric across y=20 so both spawns
293
- # face the same engagement geometry. e3 line forward at
294
- # x=60 (Dragon range 5); 3tnk escort 4 cells back at x=64
295
- # (out of weapon range from the e3 firing line so the heavy
296
- # only weighs in on a brute overshoot past the rocket line).
297
- - {type: e3, owner: enemy, position: [60, 17], stance: 3}
298
- - {type: e3, owner: enemy, position: [60, 18], stance: 3}
299
- - {type: e3, owner: enemy, position: [60, 19], stance: 3}
300
- - {type: e3, owner: enemy, position: [60, 21], stance: 3}
301
- - {type: e3, owner: enemy, position: [60, 22], stance: 3}
302
- - {type: 3tnk, owner: enemy, position: [64, 19], stance: 3}
303
- - {type: 3tnk, owner: enemy, position: [64, 21], stance: 3}
304
  - {type: fact, owner: enemy, position: [120, 20]}
305
  win_condition:
306
  all_of:
307
- - {units_killed_gte: 3}
308
  - any_of:
309
  - {units_in_region_gte: {x: 5, y: 10, radius: 6, n: 3}}
310
  - {units_in_region_gte: {x: 5, y: 30, radius: 6, n: 3}}
311
  - {own_units_gte: 3}
312
- - {within_ticks: 4500}
313
  fail_condition:
314
  any_of:
315
- - {after_ticks: 4501}
316
  - {not: {own_units_gte: 3}}
317
  max_turns: 51
 
11
  # trades hits for intel/attrition, then breaks contact before the
12
  # enemy can encircle and destroy.
13
  #
14
+ # Recalibrated 2026-05-20 after the OpenRA-Rust movement fixes
15
+ # (moving units fire AND take fire en route; attack_unit on an
16
+ # out-of-sight target paths normally). Those fixes made the close-
17
+ # range trade far more lethal, and the interrupt-mode tick cadence
18
+ # means an action-heavy episode advances FEWER ticks per turn than a
19
+ # pure stall — so the old after_ticks:4501 deadline was inert for the
20
+ # intended policy (a non-winning run DREW instead of LOSING). The
21
+ # enemy is now lighter and the deadline pulled down to 4000 (reached
22
+ # by every policy inside max_turns=51; verified no draw).
23
+ #
24
  # Idiom (the four-policy bar — same shape used by combat-kite and
25
  # combat-focus-fire):
26
+ # • stall (only observe) → LOSS. The kill bar is
27
+ # never met; the after_ticks:4000 deadline (reached ~turn 45)
28
+ # fires: a real LOSS, never a draw.
29
+ # • brute attack-until-death (commit fully) → LOSS. An attack_move
30
+ # column overshoots the e3 line into the set-back 3tnk heavy and
31
+ # is out-traded; `own_units_gte:N` flips to fail before the
32
+ # survivors can pull back.
33
+ # • never-engage (sit at safe zone) → LOSS. The tanks stay
34
+ # in the safe zone the whole episode but the kill bar is unmet →
35
+ # after_ticks LOSS.
36
  # • intended engage-then-retreat → WIN. March east into
37
+ # fire range, attack_unit the soft e3 rocket infantry (the kill-
38
+ # bar fodder), then the instant a tank is lost OR any tank's HP
39
+ # drops below a floor, issue a move_units order WEST back to the
40
+ # safe-zone region. The bar is own_units_gte:N (lose at most one
41
+ # tank on medium/hard, up to two on easy) AND ≥3 tanks back in
 
42
  # the safe-zone radius. The decision under test is DISENGAGE
43
  # TIMING: bail BEFORE attrition busts the force-preservation cap.
44
  #
45
+ # Why "battle is lethal head-on" (verified vs engine combat sheet):
46
+ # • The e3 rocket infantry (Dragon launcher, anti-armour) alpha-
47
+ # strike the column hard; a brute attack_move that closes past
48
+ # the e3 line into the set-back 3tnk's cannon range is out-traded
49
+ # and the loss cap flips before the kill bar is met.
50
+ # • The "engage the e3 line → break contact" play kills the cheap
51
+ # targets under concentrated tank fire then bails the moment the
52
+ # trade turns, keeping ≥3 tanks in the loss cap.
 
 
 
 
53
  #
54
  # Engine guardrails:
55
  # • Map: rush-hour-arena (128 × 40, playable x in [2..126],
56
  # y in [2..38]). Tanks stage at the west edge (x=5..7), enemies
57
+ # at (60,20). Cells verified placeable per CLAUDE.md.
 
 
58
  # • Enemy bot is `guard` (not `hunt`): guard holds post and lunges
59
+ # within GUARD_AGGRO~16, snaps back past leash 18. The e3 actors
60
+ # additionally carry stance:2 Defend so they auto-fire in range
61
+ # but never advance: they HOLD the firing line and do NOT chase
62
+ # the retreating column (a stance:3 e3 would hunt the tanks all
63
+ # the way home and confound the retreat). This also means the
64
+ # never-engage policy is not punished by enemy approach, so
65
+ # never-engage is a CLEAN kill-bar LOSS, not a confounded loss.
66
  # • Persistent unarmed `fact` far east (x=120, y=20) prevents engine
67
  # auto-done on enemy-elimination collapsing the run to DRAW — if
68
  # the agent accidentally kills everything, the run still evaluates
69
  # the in-region predicate at the safe zone (and the agent likely
70
  # hasn't retreated → fail clause fires on the clock or attrition).
71
+ # • after_ticks 4000 is reached by every policy inside max_turns=51
72
+ # (a pure stall crosses tick 4000 at ~turn 45; an action-heavy
73
+ # policy, which advances fewer ticks per turn in interrupt mode,
74
+ # still crosses it before turn 51) → a non-winning run is a real
75
+ # LOSS, never a draw.
76
  # • units_lost / units_killed are signal-level integers (combat-
77
  # focus-fire idiom); units_in_region_gte is the geometry predicate
78
  # established by action-multiunit-coordination / artofwar-lure-
 
129
 
130
  levels:
131
  # ── EASY ─────────────────────────────────────────────────────────
132
+ # Bare retreat skill. Recalibrated 2026-05-20 after the OpenRA-Rust
133
+ # movement fixes (moving units fire AND take fire en route; attack_
134
+ # unit on out-of-sight targets paths normally). Those fixes made the
135
+ # close-range trade far more lethal — the old 4×e3+1×3tnk enemy was
136
+ # unwinnable WITHIN the loss cap (killing the quota required losing
137
+ # too many tanks), and the inert after_ticks deadline let a
138
+ # non-winning run DRAW instead of LOSE.
139
+ # New shape: enemy is 2× e3 (anti-tank rockets, stance:2 Defend so
140
+ # they HOLD the line, not chase the retreat) on the firing line plus
141
+ # ONE 3tnk heavy escort set back at x=64. Loss cap forgiving
142
+ # (own_units_gte:2 ⇒ lose up to 2 tanks); kill bar 2 (kill both
143
+ # e3s). The engage-then-retreat play kills the e3 line and pulls
144
+ # back losing ~1 tank; the brute overcommits past the line into the
145
+ # heavy and is wiped; stall / never-engage never meet the kill bar.
146
  easy:
147
  description: >
148
  Four medium tanks (2tnk) stage at the safe zone in the west
149
+ (x=5, y=18..21). An enemy squad of TWO rocket infantry (e3,
150
+ anti-tank Dragon launcher) holds the firing line at x=60,
151
+ escorting ONE heavy tank (3tnk) set back at x=64. The fight is
152
+ lethal head-on: the rocket infantry shred armour and the 3tnk
153
+ cannon out-trades the column once it overshoots into close
154
+ range. The intended play: march east into engagement range,
155
+ focus-fire and kill the TWO e3s, then move your tanks WEST BACK
156
+ to the safe zone before attrition busts the loss cap. Win when
157
+ ≥2 enemies are killed AND ≥3 of your tanks are within radius 6
158
+ of (5,20) AND you have ≥2 tanks alive, before tick 4000. Stall,
159
+ brute attack-until-death, and never-engage all LOSE.
 
 
160
  overrides:
161
  actors:
162
  # Strike force — 4 medium tanks at the safe-zone start.
 
164
  - {type: 2tnk, owner: agent, position: [5, 19], stance: 1}
165
  - {type: 2tnk, owner: agent, position: [5, 20], stance: 1}
166
  - {type: 2tnk, owner: agent, position: [5, 21], stance: 1}
167
+ # Enemy squad — 2× e3 anti-tank rockets ON the firing line at
168
+ # x=60 (the kill bar's soft targets), stance:2 Defend so they
169
+ # HOLD the line and auto-fire in range but do NOT chase the
170
+ # retreating column (a stance:3 e3 would hunt the tanks all
171
+ # the way home and confound the retreat). The 1× 3tnk heavy
172
+ # escort sits 4 cells BACK at x=64, out of weapon range from
173
+ # the e3 firing line, so it only weighs in when the agent
174
+ # commits PAST the e3 line (the brute overshoot), supplying
175
+ # the attrition that makes the brute LOSE.
176
+ - {type: e3, owner: enemy, position: [60, 19], stance: 2}
177
+ - {type: e3, owner: enemy, position: [60, 21], stance: 2}
178
+ - {type: 3tnk, owner: enemy, position: [64, 20], stance: 2}
 
 
 
179
  # Persistent far-east enemy fact — prevents engine auto-done
180
  # on enemy-elimination so the safe-zone predicate is
181
  # evaluated rather than the run collapsing to DRAW.
 
185
  - {units_killed_gte: 2}
186
  - {units_in_region_gte: {x: 5, y: 20, radius: 6, n: 3}}
187
  - {own_units_gte: 2}
188
+ - {within_ticks: 4000}
189
  fail_condition:
190
  any_of:
191
+ - {after_ticks: 4000}
192
  - {not: {own_units_gte: 2}}
193
  max_turns: 51
194
 
195
  # ── MEDIUM ───────────────────────────────────────────────────────
196
+ # +1 controlled variable vs easy: a bigger e3 line (3× e3 instead of
197
+ # 2×, kill bar 3 instead of 2) AND a tighter survival bar
198
+ # (own_units_gte:3 ⇒ lose AT MOST ONE tank). The engage window is
199
+ # shorter: three e3s alpha-strike the column harder, so the agent
200
+ # must focus-fire efficiently and break contact a turn sooner. The
201
+ # intended engage-then-retreat still wins (focus-fire the 3 e3s,
202
+ # pull back losing one tank); the brute overcommits past the line
203
+ # into the heavy and loses the force.
204
  medium:
205
  description: >
206
  Four medium tanks (2tnk) stage at the safe zone in the west
207
+ (x=5, y=18..21). The enemy squad is THREE rocket infantry (e3,
208
+ anti-tank Dragon launcher) holding the firing line at x=60,
209
+ escorting ONE heavy tank (3tnk) set back at x=64. The fight is
210
+ lethal head-on: the rockets shred armour and the 3tnk cannon
211
+ out-trades the column once it overshoots. The intended play:
212
+ march east into engagement range, focus-fire and kill the
213
+ THREE e3s, then move your tanks WEST BACK to the safe zone
214
+ before attrition takes a second tank. Win when ≥3 enemies are
215
+ killed AND ≥3 of your tanks are within radius 6 of (5,20) AND
216
+ you have ≥3 tanks alive, before tick 4000. Stall, brute
217
+ attack-until-death, and never-engage all LOSE.
 
218
  overrides:
219
  actors:
220
  - {type: 2tnk, owner: agent, position: [5, 18], stance: 1}
221
  - {type: 2tnk, owner: agent, position: [5, 19], stance: 1}
222
  - {type: 2tnk, owner: agent, position: [5, 20], stance: 1}
223
  - {type: 2tnk, owner: agent, position: [5, 21], stance: 1}
224
+ # Enemy squad — 3× e3 (kill-bar fodder; anti-tank Dragon
225
+ # launcher) ON THE LINE at x=60, stance:2 Defend so they hold
226
+ # the line and do not chase the retreat. The 3tnk heavy
227
+ # escort sits 4 cells back at x=64, out of weapon range from
228
+ # the e3 firing line; it only weighs in when the agent
229
+ # commits PAST the e3 line (the brute overshoot). Spread the
230
+ # e3 line across y=18..22 so the squad faces fire.
231
+ - {type: e3, owner: enemy, position: [60, 18], stance: 2}
232
+ - {type: e3, owner: enemy, position: [60, 20], stance: 2}
233
+ - {type: e3, owner: enemy, position: [60, 22], stance: 2}
234
+ # Heavy tank 4 cells BEHIND the rocket line — the attrition
235
+ # trigger for the brute attack_move policy that closes past
236
+ # x=60 into 3tnk cannon range.
237
+ - {type: 3tnk, owner: enemy, position: [64, 20], stance: 2}
 
 
 
 
 
 
 
 
238
  - {type: fact, owner: enemy, position: [120, 20]}
239
  win_condition:
240
  all_of:
241
  - {units_killed_gte: 3}
242
  - {units_in_region_gte: {x: 5, y: 20, radius: 6, n: 3}}
243
  - {own_units_gte: 3}
244
+ - {within_ticks: 4000}
245
  fail_condition:
246
  any_of:
247
+ - {after_ticks: 4000}
248
  - {not: {own_units_gte: 3}}
249
  max_turns: 51
250
 
 
259
  # is symmetric across y=20 mid-latitude so both spawns face the
260
  # same engagement geometry.
261
  #
262
+ # The corner spawn is the hard challenge: the squad must close on
263
+ # the e3 line along a long DIAGONAL, taking fire the whole approach
264
+ # (the engine movement fix means a moving column is a live target).
265
+ # That diagonal makes the medium-tier 3-e3 line genuinely
266
+ # unsolvable inside the loss cap, so hard trades raw enemy count
267
+ # for positional difficulty — 2× e3 + 1× 3tnk, kill bar 2 — while
268
+ # the seed-flipped corridor and the read-your-own-safe-zone
269
+ # requirement supply the discrimination.
270
+ #
271
  # Per the CLAUDE.md `spawn_point` contract: ALL agent actors
272
  # carry an explicit spawn_point (the filter applies only to AGENT
273
  # actors); the enemy actors are unchanged and always place.
 
275
  description: >
276
  Four medium tanks (2tnk) stage at ONE of two safe-zone
277
  corridors (NORTH at x=5, y=8..11 OR SOUTH at x=5, y=28..31,
278
+ chosen by seed — anti-memorisation). An enemy squad of TWO
279
+ rocket infantry (e3) escorting ONE heavy tank (3tnk) holds
280
+ the centre at (60, 20). The fight is lethal head-on, and the
281
+ corner spawn means a long diagonal approach under fire.
282
  The intended play: march east-and-toward-centre into
283
+ engagement range, focus-fire and kill the TWO e3s, then move
284
+ your tanks BACK to YOUR safe zone (the one you started in —
285
+ read your start cell from obs) before attrition takes a
286
+ second tank. Win when ≥2 enemies are killed AND ≥3 of your
287
+ tanks are within radius 6 of YOUR safe zone (north (5,10) OR
288
+ south (5,30)) AND you have ≥3 tanks alive, before tick 4000.
 
289
  Stall, brute attack-until-death, never-engage, and retreating
290
  to the WRONG safe zone all LOSE.
291
  overrides:
 
300
  - {type: 2tnk, owner: agent, position: [5, 29], stance: 1, spawn_point: 1}
301
  - {type: 2tnk, owner: agent, position: [5, 30], stance: 1, spawn_point: 1}
302
  - {type: 2tnk, owner: agent, position: [5, 31], stance: 1, spawn_point: 1}
303
+ # Enemy squad — symmetric across y=20 so both spawns face the
304
+ # same engagement geometry. e3 line forward at x=60 (Dragon
305
+ # range 5), stance:2 Defend so the e3 hold the line and do
306
+ # not chase the retreat. The 3tnk escort sits 4 cells back
307
+ # at x=64, out of weapon range from the e3 firing line so the
308
+ # heavy only weighs in on a brute overshoot past the line.
309
+ - {type: e3, owner: enemy, position: [60, 19], stance: 2}
310
+ - {type: e3, owner: enemy, position: [60, 21], stance: 2}
311
+ - {type: 3tnk, owner: enemy, position: [64, 20], stance: 2}
 
 
 
312
  - {type: fact, owner: enemy, position: [120, 20]}
313
  win_condition:
314
  all_of:
315
+ - {units_killed_gte: 2}
316
  - any_of:
317
  - {units_in_region_gte: {x: 5, y: 10, radius: 6, n: 3}}
318
  - {units_in_region_gte: {x: 5, y: 30, radius: 6, n: 3}}
319
  - {own_units_gte: 3}
320
+ - {within_ticks: 4000}
321
  fail_condition:
322
  any_of:
323
+ - {after_ticks: 4000}
324
  - {not: {own_units_gte: 3}}
325
  max_turns: 51
tests/test_combat_retreat_after_engagement.py CHANGED
@@ -1,17 +1,38 @@
1
  """combat-retreat-after-engagement — disengage to preserve the force.
2
 
3
- Bar (four script-policy proxies):
4
- stall (observe only) → LOSS (kill bar unmet)
5
- brute attack-until-death → LOSS (loses too many tanks)
6
- never-engage (sit at safe zone) → LOSS (kill bar unmet)
7
- intended engage-then-retreat → WIN
8
-
9
- The "intended" policy is the spec's load-bearing decision: march east
10
- into engagement range, focus-fire e3 rocket infantry (the cheap kill-
11
- bar targets), and the instant the kill bar is met OR a tank is lost
12
- pull back to the safe-zone radius. The retreat trigger is the
13
- capability under test too early kill bar fails; too late
14
- attrition busts the survival bar.
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
15
  """
16
 
17
  from __future__ import annotations
@@ -56,7 +77,8 @@ def _ctx(units_xy=(), tick=1000, killed=0, lost=0):
56
 
57
  def test_predicates_easy():
58
  c = compile_level(load_pack(PACK_PATH), "easy")
59
- # 3 tanks back in safe zone (5,20,r=6), killed 2 enemies, 1 lost, in time → WIN
 
60
  home3 = [(5, 18), (5, 20), (5, 21)]
61
  assert evaluate(c.win_condition, _ctx(home3, tick=3000, killed=2, lost=1))
62
  # Kill bar unmet (only 1 killed) → not WIN
@@ -68,9 +90,9 @@ def test_predicates_easy():
68
  # 3 tanks lost (only 1 alive) → fail clause own_units_gte:2 fires
69
  assert evaluate(c.fail_condition, _ctx([(5, 20)], tick=3000, killed=3, lost=3))
70
  # Past deadline → real LOSS reachable within max_turns
71
- assert evaluate(c.fail_condition, _ctx(home3, tick=4502, killed=0, lost=0))
72
- assert 4501 <= 93 + 90 * (c.max_turns - 1), (
73
- "easy after_ticks 4501 must be reachable within max_turns"
74
  )
75
 
76
 
@@ -90,27 +112,27 @@ def test_predicates_medium_force_preservation_bar():
90
  # 2 tanks alive ⇒ fail clause fires (preservation cap)
91
  assert evaluate(c.fail_condition, _ctx(home2, tick=3000, killed=3, lost=2))
92
  # Past deadline ⇒ real LOSS reachable
93
- assert evaluate(c.fail_condition, _ctx(home3, tick=4502, killed=0, lost=0))
94
- assert 4501 <= 93 + 90 * (c.max_turns - 1)
95
 
96
 
97
  def test_predicates_hard_two_safe_zones():
98
  c = compile_level(load_pack(PACK_PATH), "hard")
99
  # NORTH safe zone (5,10) satisfies the any_of geometry
100
  home_north = [(5, 9), (5, 10), (5, 11)]
101
- assert evaluate(c.win_condition, _ctx(home_north, tick=3000, killed=3, lost=1))
102
  # SOUTH safe zone (5,30) also satisfies the any_of geometry
103
  home_south = [(5, 29), (5, 30), (5, 31)]
104
- assert evaluate(c.win_condition, _ctx(home_south, tick=3000, killed=3, lost=1))
105
  # Tanks at the WRONG centre (5,20) — outside BOTH safe zones at r=6
106
  # ((5,20)-(5,10)=10>6 and (5,20)-(5,30)=10>6) → fails the geometry
107
  assert not evaluate(
108
  c.win_condition,
109
- _ctx([(5, 20), (5, 19), (5, 21)], tick=3000, killed=3, lost=1),
110
  )
111
- # Past tighter deadline → real LOSS reachable
112
- assert evaluate(c.fail_condition, _ctx(home_north, tick=4502, killed=0, lost=0))
113
- assert 4501 <= 93 + 90 * (c.max_turns - 1)
114
 
115
 
116
  def test_hard_has_two_spawn_point_groups():
@@ -143,13 +165,16 @@ def test_pack_compiles_and_meta_fields_populated():
143
 
144
 
145
  def test_timeout_loss_is_reachable_on_every_level():
146
- """No draw degeneracy: the after_ticks deadline fits inside
147
- max_turns on every level (∼90 ticks/turn ⇒ 93 + 90·(max_turns-1))."""
 
 
 
148
  pack = load_pack(PACK_PATH)
149
  for lvl in ("easy", "medium", "hard"):
150
  c = compile_level(pack, lvl)
151
- assert 4501 <= 93 + 90 * (c.max_turns - 1), (
152
- f"{lvl}: after_ticks 4501 not reachable within max_turns"
153
  )
154
 
155
 
@@ -158,18 +183,22 @@ def test_timeout_loss_is_reachable_on_every_level():
158
  # The four-policy bar. All engine-driven tests guard on the Rust env
159
  # wheel; predicate-level tests above run without it.
160
 
 
 
 
 
 
161
 
162
  def _stall_policy(rs, Command):
163
- """Stall: only observe. Kill bar never met → after_ticks LOSS."""
164
  return [Command.observe()]
165
 
166
 
167
  def _brute_attack_until_death_policy(rs, Command):
168
  """Brute: attack_move toward the enemy centre and never retreat.
169
  The column overshoots the e3 firing line into the set-back 3tnk
170
- heavy escort (x=64 on every tier easy 3tnk, medium/hard
171
- 3tnk); the heavy + rocket mass alpha out-trades 2tnk and
172
- the column dies before clearing the squad → own_units_gte:N
173
  fails on every level."""
174
  units = rs.get("units_summary", []) or []
175
  if not units:
@@ -197,24 +226,28 @@ def _never_engage_policy(rs, Command):
197
 
198
 
199
  def _make_intended_engage_then_retreat():
200
- """Intended policy (the spec's load-bearing decision): march to
201
- the engagement axis, focus-fire e3 rocket infantry, and the
202
- instant a tank is lost OR ≥3 enemies are observed killed pull
203
- back to the safe-zone radius (detected from the agent's spawn
204
- median-y latched on first observation). Stateful: uses a
205
- closure to track the peak number of visible killables (so we
206
- can infer kills from the shrink without reading
207
- signals.units_killed)."""
208
- state = {"peak_visible": 0, "retreat_latched": False, "home_y": None}
 
 
 
 
 
 
209
 
210
  def pol(rs, Command):
211
  units = rs.get("units_summary", []) or []
212
  enemies = rs.get("enemy_summary", []) or []
213
  if not units:
214
  return [Command.observe()]
215
- # Latch the home Y on first observation. The agent's spawn
216
- # cell median y resolves to one of the three safe-zone
217
- # corridors (north y=10, centre y=20, south y=30).
218
  if state["home_y"] is None:
219
  ys = sorted(u["cell_y"] for u in units)
220
  hy_med = ys[len(ys) // 2]
@@ -226,43 +259,38 @@ def _make_intended_engage_then_retreat():
226
  state["home_y"] = 20
227
  hy = state["home_y"]
228
  n_alive = len(units)
 
229
  killable = [
230
  e
231
  for e in enemies
232
  if not e.get("is_building")
233
  and (e.get("type") or "").lower() != "fact"
234
  ]
235
- visible = len(killable)
236
- if visible > state["peak_visible"]:
237
- state["peak_visible"] = visible
238
- killed_observed = state["peak_visible"] - visible
239
- # RETREAT TRIGGER: latched, or any tank lost, or ≥3 enemies
240
- # observed killed. Once retreating, stay retreating (a re-
241
- # engagement would re-expose the survivors to attrition).
242
- if state["retreat_latched"] or n_alive < 4 or killed_observed >= 3:
243
- state["retreat_latched"] = True
244
  return [
245
  Command.move_units([str(u["id"])], target_x=5, target_y=hy)
246
  for u in units
247
  ]
248
- # ENGAGE: pick the closest e3 to home and focus-fire it with
249
- # ALL tanks (4-vs-1 ends a Dragon-soldier in 1-2 decision turns).
250
- e3s = [e for e in killable if (e.get("type") or "").lower() == "e3"]
251
- if e3s:
252
  e3s.sort(
253
- key=lambda e: (e["cell_x"] - 5) ** 2 + (e["cell_y"] - hy) ** 2
254
  )
255
  t = e3s[0]
256
  return [
257
  Command.attack_unit([str(u["id"])], str(t["id"])) for u in units
258
  ]
259
- # APPROACH: advance toward the engagement axis (50, 20) so
260
- # the spawn corridor (y=10 or y=30 on hard) closes onto the
261
- # mid-latitude line where the e3s will come into view.
262
  return [
263
  Command.move_units(
264
  [str(u["id"])],
265
- target_x=min(50, u["cell_x"] + 12),
266
  target_y=20,
267
  )
268
  for u in units
@@ -273,13 +301,13 @@ def _make_intended_engage_then_retreat():
273
 
274
  @pytest.mark.parametrize("level", ["easy", "medium", "hard"])
275
  def test_stall_policy_loses(level):
276
- """Stall must LOSE on every level — kill bar unmet → after_ticks LOSS."""
 
277
  pytest.importorskip("openra_train")
278
  from openra_bench.eval_core import run_level
279
 
280
  c = compile_level(load_pack(PACK_PATH), level)
281
- seeds = (1, 2, 3, 4) if level == "hard" else (1,)
282
- for s in seeds:
283
  res = run_level(c, _stall_policy, seed=s)
284
  assert res.outcome == "loss", (
285
  f"{level} seed={s}: stall must LOSE; got {res.outcome} "
@@ -289,14 +317,14 @@ def test_stall_policy_loses(level):
289
 
290
  @pytest.mark.parametrize("level", ["easy", "medium", "hard"])
291
  def test_brute_attack_until_death_loses(level):
292
- """Brute attack-until-death must LOSE — the mass alpha at the
293
- enemy line out-trades the column before the bar is met."""
 
294
  pytest.importorskip("openra_train")
295
  from openra_bench.eval_core import run_level
296
 
297
  c = compile_level(load_pack(PACK_PATH), level)
298
- seeds = (1, 2, 3, 4) if level == "hard" else (1,)
299
- for s in seeds:
300
  res = run_level(c, _brute_attack_until_death_policy, seed=s)
301
  assert res.outcome == "loss", (
302
  f"{level} seed={s}: brute must LOSE; got {res.outcome} "
@@ -312,8 +340,7 @@ def test_never_engage_policy_loses(level):
312
  from openra_bench.eval_core import run_level
313
 
314
  c = compile_level(load_pack(PACK_PATH), level)
315
- seeds = (1, 2, 3, 4) if level == "hard" else (1,)
316
- for s in seeds:
317
  res = run_level(c, _never_engage_policy, seed=s)
318
  assert res.outcome == "loss", (
319
  f"{level} seed={s}: never-engage must LOSE; got {res.outcome} "
@@ -324,14 +351,15 @@ def test_never_engage_policy_loses(level):
324
  @pytest.mark.parametrize("level", ["easy", "medium", "hard"])
325
  def test_intended_engage_then_retreat_wins(level):
326
  """Intended engage-then-retreat must WIN on every level and every
327
- hard seed (1..4): focus-fire e3s, retreat the instant a tank is
328
- lost or ≥3 kills observed, end with ≥3 tanks in the safe zone."""
 
 
329
  pytest.importorskip("openra_train")
330
  from openra_bench.eval_core import run_level
331
 
332
  c = compile_level(load_pack(PACK_PATH), level)
333
- seeds = (1, 2, 3, 4) if level == "hard" else (1,)
334
- for s in seeds:
335
  pol = _make_intended_engage_then_retreat()
336
  res = run_level(c, pol, seed=s)
337
  assert res.outcome == "win", (
 
1
  """combat-retreat-after-engagement — disengage to preserve the force.
2
 
3
+ Bar (recalibrated 2026-05-20 after the OpenRA-Rust engine movement
4
+ fixes: moving units fire AND take fire en route, and attack_unit on
5
+ an out-of-sight target paths normally). Those fixes made the close-
6
+ range trade far more lethal: the old 4-5 e3 + 1-2 3tnk enemy was
7
+ unwinnable inside the loss cap (killing the quota required losing too
8
+ many tanks), and, because interrupt mode advances FEWER ticks per
9
+ turn for an action-heavy policy than for a pure stall, the old
10
+ after_ticks:4501 deadline was inert for the intended policy, so a
11
+ non-winning run DREW instead of LOSING. Recalibration: lighter enemy
12
+ (easy 2 e3 + 1 3tnk, medium 3 e3 + 1 3tnk, hard 2 e3 + 1 3tnk; e3 at
13
+ stance:2 Defend so they hold the line and do not chase the retreat),
14
+ deadline pulled down to 4000 (reached by every policy inside
15
+ max_turns).
16
+
17
+ The four script-policy proxies, every level, seeds 1-4:
18
+
19
+ • stall (observe only) → LOSS — kill bar never met; the
20
+ after_ticks:4000 deadline (reached ~turn 45) fires → real LOSS.
21
+ • brute attack-until-death → LOSS — the attack_move column
22
+ overshoots the e3 line into the set-back 3tnk and is out-traded;
23
+ loses too many tanks before the bar is met.
24
+ • never-engage (sit at safe zone) → LOSS — ≥3 tanks survive in the
25
+ safe zone but the kill bar is never met → after_ticks LOSS.
26
+ • intended engage-then-retreat → WIN — march to the engagement
27
+ line, focus-fire the e3 rocket infantry, and the instant a tank
28
+ is lost OR any tank's HP drops below a floor pull back to the
29
+ safe zone. End with ≥3 tanks in the safe zone and the kill bar
30
+ met.
31
+
32
+ The "intended" policy is the spec's load-bearing decision: the
33
+ retreat trigger (HP-floor / tank-lost) is the capability under test —
34
+ too late ⇒ attrition busts the survival bar; never engaging ⇒ the
35
+ kill bar fails.
36
  """
37
 
38
  from __future__ import annotations
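The retreat trigger named in the module docstring above (tank lost OR any tank's HP below a floor, latched once fired) can be read in isolation. The following is a minimal sketch, not code from the test module: `should_retreat` is a hypothetical helper, and the 4-tank squad size, the 0.5 floor, and the fractional `hp` field are assumptions taken from the policy code in this diff.

```python
# Hypothetical helper (not in the test module): isolates the disengage
# decision so the trigger can be checked without the engine.
def should_retreat(units, latched, squad_size=4, hp_floor=0.5):
    """Return True once the squad should start (or keep) retreating."""
    if latched:
        # Once retreating, stay retreating.
        return True
    units = list(units)
    if len(units) < squad_size:
        # At least one tank has been lost.
        return True
    # Assumes "hp" is a fraction in [0, 1]; a missing value counts as full.
    min_hp = min((u.get("hp", 1.0) for u in units), default=1.0)
    return min_hp <= hp_floor


healthy = [{"hp": 1.0}, {"hp": 0.9}, {"hp": 0.8}, {"hp": 0.7}]
damaged = [{"hp": 1.0}, {"hp": 0.9}, {"hp": 0.8}, {"hp": 0.4}]
three_left = healthy[:3]
```

Keeping the decision a pure function of the current observation plus one latch bit is what lets the policy avoid counting kills from visibility changes.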
 
77
 
78
  def test_predicates_easy():
79
  c = compile_level(load_pack(PACK_PATH), "easy")
80
+ # 3 tanks back in safe zone (5,20,r=6), killed 2 enemies, 1 lost,
81
+ # in time → WIN
82
  home3 = [(5, 18), (5, 20), (5, 21)]
83
  assert evaluate(c.win_condition, _ctx(home3, tick=3000, killed=2, lost=1))
84
  # Kill bar unmet (only 1 killed) → not WIN
 
90
  # 3 tanks lost (only 1 alive) → fail clause own_units_gte:2 fires
91
  assert evaluate(c.fail_condition, _ctx([(5, 20)], tick=3000, killed=3, lost=3))
92
  # Past deadline → real LOSS reachable within max_turns
93
+ assert evaluate(c.fail_condition, _ctx(home3, tick=4002, killed=0, lost=0))
94
+ assert 4000 <= 93 + 90 * (c.max_turns - 1), (
95
+ "easy after_ticks 4000 must be reachable within max_turns"
96
  )
97
 
98
 
 
112
  # 2 tanks alive ⇒ fail clause fires (preservation cap)
113
  assert evaluate(c.fail_condition, _ctx(home2, tick=3000, killed=3, lost=2))
114
  # Past deadline ⇒ real LOSS reachable
115
+ assert evaluate(c.fail_condition, _ctx(home3, tick=4002, killed=0, lost=0))
116
+ assert 4000 <= 93 + 90 * (c.max_turns - 1)
117
 
118
 
119
  def test_predicates_hard_two_safe_zones():
120
  c = compile_level(load_pack(PACK_PATH), "hard")
121
  # NORTH safe zone (5,10) satisfies the any_of geometry
122
  home_north = [(5, 9), (5, 10), (5, 11)]
123
+ assert evaluate(c.win_condition, _ctx(home_north, tick=3000, killed=2, lost=1))
124
  # SOUTH safe zone (5,30) also satisfies the any_of geometry
125
  home_south = [(5, 29), (5, 30), (5, 31)]
126
+ assert evaluate(c.win_condition, _ctx(home_south, tick=3000, killed=2, lost=1))
127
  # Tanks at the WRONG centre (5,20) — outside BOTH safe zones at r=6
128
  # ((5,20)-(5,10)=10>6 and (5,20)-(5,30)=10>6) → fails the geometry
129
  assert not evaluate(
130
  c.win_condition,
131
+ _ctx([(5, 20), (5, 19), (5, 21)], tick=3000, killed=2, lost=1),
132
  )
133
+ # Past the deadline → real LOSS reachable
134
+ assert evaluate(c.fail_condition, _ctx(home_north, tick=4002, killed=0, lost=0))
135
+ assert 4000 <= 93 + 90 * (c.max_turns - 1)
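The two-safe-zone geometry exercised above can be sketched as a standalone check. This is an illustration under stated assumptions, not the pack's evaluator: `tanks_in_any_zone` is a hypothetical name, Euclidean distance is assumed (the three quoted cases come out the same under Manhattan or Chebyshev distance), and it assumes all required tanks must sit in the same zone.

```python
import math


# Hypothetical stand-in for the any_of safe-zone clause; the real
# predicate is compiled from the YAML pack and may differ in detail.
def tanks_in_any_zone(cells, zones, radius=6, need=3):
    """True if at least `need` cells lie within `radius` of one zone centre."""
    for zx, zy in zones:
        inside = sum(
            1 for x, y in cells if math.hypot(x - zx, y - zy) <= radius
        )
        if inside >= need:
            return True
    return False


ZONES = [(5, 10), (5, 30)]  # north and south centres from the test comments
home_north = [(5, 9), (5, 10), (5, 11)]
home_south = [(5, 29), (5, 30), (5, 31)]
wrong_centre = [(5, 20), (5, 19), (5, 21)]
```

The wrong-centre cells are at least 9 cells from either zone centre, so they fail at r=6 regardless of which zone is tried.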
136
 
137
 
138
  def test_hard_has_two_spawn_point_groups():
 
165
 
166
 
167
  def test_timeout_loss_is_reachable_on_every_level():
168
+ """No draw degeneracy: the after_ticks:4000 deadline fits inside
169
+ max_turns on every level (∼90 ticks/turn ⇒ 93 + 90·(max_turns-1)),
170
+ and — verified by the engine-driven tests below — is actually
171
+ crossed by every policy, including an action-heavy one running in
172
+ interrupt mode."""
173
  pack = load_pack(PACK_PATH)
174
  for lvl in ("easy", "medium", "hard"):
175
  c = compile_level(pack, lvl)
176
+ assert 4000 <= 93 + 90 * (c.max_turns - 1), (
177
+ f"{lvl}: after_ticks 4000 not reachable within max_turns"
178
  )
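The static bound asserted above is necessary but not sufficient, which is why the commit lowers the deadline rather than relying on it. A small worked sketch, using the max_turns=51 and the ~4428 final tick quoted in the commit message (`nominal_tick_budget` is a hypothetical name for the formula in the assertion):

```python
def nominal_tick_budget(max_turns, first_turn_ticks=93, ticks_per_turn=90):
    """Upper estimate of ticks elapsed by the last turn at ~90 ticks/turn."""
    return first_turn_ticks + ticks_per_turn * (max_turns - 1)


budget = nominal_tick_budget(51)      # 93 + 90 * 50
observed_action_heavy_ticks = 4428    # figure quoted in the commit message
old_deadline, new_deadline = 4501, 4000
```

The old deadline of 4501 fits inside the nominal budget of 4593, yet an action-heavy run in interrupt mode stopped near tick 4428 and never crossed it, producing a draw. The new deadline of 4000 sits below that observed figure, so the same run ends in a timeout LOSS.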
179
 
180
 
 
183
  # The four-policy bar. All engine-driven tests guard on the Rust env
184
  # wheel; predicate-level tests above run without it.
185
 
186
+ # Retreat the instant any tank's HP drops to or below this floor; ENGAGE_X
187
+ # is the x-line the squad closes to before opening fire.
188
+ RETREAT_HP_FLOOR = 0.5
189
+ ENGAGE_X = 54
190
+
191
 
192
  def _stall_policy(rs, Command):
193
+ """Stall: only observe. Kill bar never met → after_ticks:4000 LOSS."""
194
  return [Command.observe()]
195
 
196
 
197
  def _brute_attack_until_death_policy(rs, Command):
198
  """Brute: attack_move toward the enemy centre and never retreat.
199
  The column overshoots the e3 firing line into the set-back 3tnk
200
+ heavy escort (x=64); the heavy + rocket fire out-trades the column
201
+ and it loses too many tanks before the bar is met → own_units_gte:N
 
202
  fails on every level."""
203
  units = rs.get("units_summary", []) or []
204
  if not units:
 
226
 
227
 
228
  def _make_intended_engage_then_retreat():
229
+ """Intended policy (the spec's load-bearing decision), in three
230
+ phases driven purely by the per-turn observation, with no fragile
231
+ kill-count inference:
232
+
233
+ 1. APPROACH: march all tanks to the engagement line
234
+ (ENGAGE_X, 20). The home safe-zone latitude is latched from
235
+ the spawn cell median y on the first observation (north y=10,
236
+ centre y=20, south y=30).
237
+ 2. ENGAGE: once the squad is at the line, focus-fire the
238
+ nearest e3 rocket soldier with ALL tanks.
239
+ 3. RETREAT (latched): the instant a tank is lost OR any tank's
240
+ HP drops below RETREAT_HP_FLOOR, pull every tank back to the
241
+ home safe zone and stay there. The HP-floor / tank-lost
242
+ trigger is the disengage-timing decision under test."""
243
+ state = {"latched": False, "home_y": None}
244
 
245
  def pol(rs, Command):
246
  units = rs.get("units_summary", []) or []
247
  enemies = rs.get("enemy_summary", []) or []
248
  if not units:
249
  return [Command.observe()]
250
+ # Latch the home Y on first observation.
 
 
251
  if state["home_y"] is None:
252
  ys = sorted(u["cell_y"] for u in units)
253
  hy_med = ys[len(ys) // 2]
 
259
  state["home_y"] = 20
260
  hy = state["home_y"]
261
  n_alive = len(units)
262
+ min_hp = min((u.get("hp", 1.0) for u in units), default=1.0)
263
  killable = [
264
  e
265
  for e in enemies
266
  if not e.get("is_building")
267
  and (e.get("type") or "").lower() != "fact"
268
  ]
269
+ e3s = [e for e in killable if (e.get("type") or "").lower() == "e3"]
270
+ # RETREAT TRIGGER: latched, a tank lost, or any tank below the
271
+ # HP floor. Once retreating, stay retreating.
272
+ if state["latched"] or n_alive < 4 or min_hp <= RETREAT_HP_FLOOR:
273
+ state["latched"] = True
 
 
 
 
274
  return [
275
  Command.move_units([str(u["id"])], target_x=5, target_y=hy)
276
  for u in units
277
  ]
278
+ # ENGAGE: once every tank is within 4 cells of the line, focus-fire the
279
+ # nearest e3 with ALL tanks.
280
+ at_line = all(u["cell_x"] >= ENGAGE_X - 4 for u in units)
281
+ if e3s and at_line:
282
  e3s.sort(
283
+ key=lambda e: (e["cell_x"] - 5) ** 2 + (e["cell_y"] - 20) ** 2
284
  )
285
  t = e3s[0]
286
  return [
287
  Command.attack_unit([str(u["id"])], str(t["id"])) for u in units
288
  ]
289
+ # APPROACH: advance toward the engagement line at y=20.
 
 
290
  return [
291
  Command.move_units(
292
  [str(u["id"])],
293
+ target_x=min(ENGAGE_X, u["cell_x"] + 12),
294
  target_y=20,
295
  )
296
  for u in units
 
301
 
302
  @pytest.mark.parametrize("level", ["easy", "medium", "hard"])
303
  def test_stall_policy_loses(level):
304
+ """Stall must LOSE on every level — kill bar unmet → the
305
+ after_ticks:4000 deadline fires (real LOSS, never a draw)."""
306
  pytest.importorskip("openra_train")
307
  from openra_bench.eval_core import run_level
308
 
309
  c = compile_level(load_pack(PACK_PATH), level)
310
+ for s in (1, 2, 3, 4):
 
311
  res = run_level(c, _stall_policy, seed=s)
312
  assert res.outcome == "loss", (
313
  f"{level} seed={s}: stall must LOSE; got {res.outcome} "
 
317
 
318
  @pytest.mark.parametrize("level", ["easy", "medium", "hard"])
319
  def test_brute_attack_until_death_loses(level):
320
+ """Brute attack-until-death must LOSE — the column overshoots the
321
+ e3 line into the set-back 3tnk and is out-traded before the bar
322
+ is met."""
323
  pytest.importorskip("openra_train")
324
  from openra_bench.eval_core import run_level
325
 
326
  c = compile_level(load_pack(PACK_PATH), level)
327
+ for s in (1, 2, 3, 4):
 
328
  res = run_level(c, _brute_attack_until_death_policy, seed=s)
329
  assert res.outcome == "loss", (
330
  f"{level} seed={s}: brute must LOSE; got {res.outcome} "
 
340
  from openra_bench.eval_core import run_level
341
 
342
  c = compile_level(load_pack(PACK_PATH), level)
343
+ for s in (1, 2, 3, 4):
 
344
  res = run_level(c, _never_engage_policy, seed=s)
345
  assert res.outcome == "loss", (
346
  f"{level} seed={s}: never-engage must LOSE; got {res.outcome} "
 
351
  @pytest.mark.parametrize("level", ["easy", "medium", "hard"])
352
  def test_intended_engage_then_retreat_wins(level):
353
  """Intended engage-then-retreat must WIN on every level and every
354
+ seed (1..4): march to the engagement line, focus-fire the e3s,
355
+ retreat the instant a tank is lost or any tank's HP drops below
356
+ the floor, end with ≥3 tanks in the safe zone and the kill bar
357
+ met (recalibrated 2026-05-20: killed 2-3, lost 1)."""
358
  pytest.importorskip("openra_train")
359
  from openra_bench.eval_core import run_level
360
 
361
  c = compile_level(load_pack(PACK_PATH), level)
362
+ for s in (1, 2, 3, 4):
 
363
  pol = _make_intended_engage_then_retreat()
364
  res = run_level(c, pol, seed=s)
365
  assert res.outcome == "win", (