yxc20098 committed on
Commit 7604738 · 1 Parent(s): f493054

feat(scenario): combat-kite-and-pull — hit-and-pull kiting micro vs a slow heavy (SC2 kiting micro)

ACTION pack: fast 2tnk raiders must strike-then-PULL a hunting 3tnk
heavy — fire at range, retreat out of the heavy's lethal close-range
window, repeat. Stand-and-fight, brute, and stall all LOSE; only the
move-away + attack_unit kite cycle WINS. Medium/hard tighten the bar
to a perfect pull (all three raiders survive). Hard defines two
seed-driven spawn_point corridors.
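
The strike-then-pull rule described above can be sketched as a pure per-turn decision. This is a minimal illustration, not the pack's code: the function name `kite_order`, the tuple-shaped orders, and the default 6-cell pull radius are assumptions made for the example. The Manhattan distance and the westward 10-cell retreat mirror the scripted policy in the test file.

```python
from typing import Tuple

Cell = Tuple[int, int]


def manhattan(a: Cell, b: Cell) -> int:
    """Cell distance as |dx| + |dy|."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def kite_order(
    raider: Cell,
    heavy: Cell,
    pull_radius: int = 6,   # ASSUMPTION: "~6 cells" from the level text
    pull_cells: int = 10,   # retreat distance used by the scripted policy
    min_x: int = 4,         # keep the retreat target inside the map
):
    """Return ("move", target_cell) to PULL or ("attack", heavy_cell) to STRIKE."""
    if manhattan(raider, heavy) <= pull_radius:
        # Heavy has closed: step away along the raider's own lane (same y).
        return ("move", (max(min_x, raider[0] - pull_cells), raider[1]))
    # Heavy is still outside the pull radius: fire at it.
    return ("attack", heavy)


print(kite_order((30, 10), (33, 12)))  # heavy 5 cells away
print(kite_order((30, 10), (80, 20)))  # heavy far away
```

The rule is stateless: each turn's order depends only on the current geometry, which is what makes the cycle "purely reactive".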

openra_bench/scenarios/packs/combat-kite-and-pull.yaml ADDED
@@ -0,0 +1,254 @@
+# combat-kite-and-pull — ACTION, kiting micro: hit-and-PULL a slow
+# heavy enemy with a fast light strike force (Wave-12).
+#
+# The capability under test is the STRIKE-then-PULL cycle: a fast
+# light unit closes to weapon range, FIRES, then retreats out of the
+# heavy's lethal close-range window BEFORE the heavy can fire back —
+# and repeats. The heavy out-trades the light force head-on, so
+# standing and fighting LOSES; the light force's speed advantage is
+# the only edge, and it only pays off if the agent strings together
+# the move-away + attack_unit cycle every turn.
+#
+# Real-world anchors:
+#   • SC2 kiting micro (vulture/muta vs marines, stalker-vs-zealot):
+#     fast unit fires, steps back before the slower foe closes,
+#     re-engages. The "blink-back" / micro-dance.
+#   • Cavalry skirmish doctrine: light cavalry charges to contact,
+#     looses a volley, then WHEELS AWAY before the heavier line can
+#     engage — fire-and-maneuver, never a sustained melee.
+#   • Economy-of-force: a small mobile force defeats a heavier,
+#     concentrated foe by exploiting a mobility asymmetry rather
+#     than by mass.
+#
+# Distinct from `combat-kite-jeep-vs-tank` (Wave-4): that pack frames
+# the trade as "preserve ≥2 of 3"; this pack tightens the bar to a
+# perfect PULL — medium and hard require ALL THREE raiders to survive
+# (`own_units_gte:3` — a sloppy kite that trades even one raider for
+# the heavy LOSES) and carries an explicit "no-disengage" brute /
+# wrong-path policy in its four-policy bar. The shared idiom —
+# diagonal-lag geometry, hunt-bot heavy, the move-away + attack_unit
+# cycle — is the proven engine-realised kite test
+# (`combat-kite-jeep-vs-tank` medium/hard).
+#
+# Engine-realised pairing note: in OpenRA-Rust the literal jeep MG is
+# anti-infantry and does not dent heavy armour (engine weapons
+# table), so the fast "raider" is the allied medium tank 2tnk
+# (faster than the soviet 3tnk heavy, and its cannon CAN damage heavy
+# armour). The capability under test is the kite-and-pull cycle —
+# the unit pairing is the vehicle for that test, not the point.
+#
+# The four-policy bar (CLAUDE.md "no defect, no cheat, no draw"):
+#   • stall (observe only) → LOSS. The raiders are
+#     stance:1 (ReturnFire) — they auto-return fire ONLY after the
+#     heavy shoots them, but a passive stack that never kites is
+#     out-traded and overrun by the closing heavy force → the
+#     survival bar fails and/or the `after_ticks` deadline bites.
+#   • stand-and-fight (attack_move onto the heavy, never retreat)
+#     → LOSS. The heavy cannon out-trades the raider stack head-on;
+#     the raiders die before the heavy's HP runs out → the survival
+#     bar (own_units_gte) fails.
+#   • brute / wrong-path (one attack_move order far east, chase the
+#     heavy with no disengage) → LOSS. Same close-range trade as
+#     stand-and-fight; no kite cycle ⇒ the raiders are overrun.
+#   • intended kite-and-pull (each turn: if the heavy is within ~6
+#     cells, MOVE the raiders AWAY along the lane; else attack_unit
+#     the heavy; repeat) → WIN. The speed advantage keeps the heavy
+#     at the edge of the raiders' fire envelope, whittling it down
+#     across fire-then-retreat cycles while preserving the survival
+#     bar.
+#
+# Topology (rush-hour-arena, 128×40, playable x 2..126, y 2..38):
+#   • Raiders stage centre-west, spread across three cells (not
+#     stacked — a stack pin-piles in retreat).
+#   • The heavy starts centre-east on the MID latitude (y=20) under
+#     the hunt bot so it pursues the raiders' centroid — the agent
+#     must out-pace it, and the hunt advance is what brings the
+#     heavy into vision so the kite cycle has a target.
+#   • Raiders stage OFF the heavy's latitude (a north corridor on
+#     easy/medium) so the kite cycle has a reactive y-axis window.
+#   • Persistent unarmed enemy `fact` far east keeps the episode
+#     alive past the heavy's death so the win/fail evaluator runs
+#     (CLAUDE.md auto-done footgun).
+#
+# Validate (no model / no network):
+#   python3 -m pytest tests/test_combat_kite_and_pull.py -q
+
+meta:
+  id: combat-kite-and-pull
+  title: 'Combat Micro — Kite and Pull a Slow Heavy Force'
+  capability: action
+  real_world_meaning: >
+    A fast light strike force must destroy a slower, heavier enemy
+    that out-trades it head-on. The only winning play is the
+    hit-and-PULL cycle: each turn, strike the heavy at weapon range,
+    then RETREAT the strike force out of the heavy's lethal
+    close-range window before it can fire back — and repeat. Standing
+    and fighting LOSES: the heavy cannon collapses the light force's
+    HP before its own runs out. The skill being measured is combat
+    micro under a mobility asymmetry — exploit the speed edge by
+    stringing together move-away + attack cycles instead of issuing
+    one beeline charge.
+  robotics_analogue: >
+    A fast/light agent team defeating a slow/heavy adversary by
+    exploiting a mobility asymmetry: a closed-loop evade-then-engage
+    policy rather than a one-shot commit. The per-turn decision is
+    proximity control — stay outside the adversary's lethal radius
+    while delivering effect at standoff range.
+  benchmark_anchor:
+    - "SC2 kiting micro"
+    - "cavalry skirmish doctrine"
+    - "military fire-and-maneuver doctrine"
+    - "economy-of-force"
+  author: openra-bench
+
+base_map: rush-hour-arena
+
+base:
+  agent: {faction: allies, cash: 0}
+  # `hunt` bot: the heavy actively PURSUES the raiders' centroid so
+  # the engagement starts on contact (no fog-blind opening) and the
+  # kite cycle has a moving target to pull. A stance:2 heavy left
+  # idle in the fog would never be discoverable; the hunt advance is
+  # what makes the heavy visible AND what the agent must out-pace.
+  enemy: {faction: soviet, cash: 0, bot_type: hunt}
+  tools: [observe, move_units, attack_unit, attack_move, stop]
+  planning: true
+  termination: {max_ticks: 7000}
+  actors: []
+
+levels:
+  # ── EASY ────────────────────────────────────────────────────────
+  # Bare kite-and-pull skill: 3 medium-tank raiders vs ONE heavy
+  # (3tnk). Raiders stage off the heavy's latitude (north corridor
+  # y=10) so the kite cycle has a reactive window. Survival bar ≥2
+  # raiders. Stall LOSES (ReturnFire raiders shoot back only once
+  # hit, so the hunting heavy out-trades them). Stand-and-fight /
+  # brute LOSE (the heavy cannon out-trades the stack head-on).
+  easy:
+    description: >
+      Three fast medium-tank raiders (2tnk) stage at the centre-west
+      north corridor (y=10). ONE enemy heavy tank (3tnk) sits
+      centre-east on the mid latitude (x≈80, y=20). The heavy
+      out-trades your raiders at close range — standing and fighting
+      LOSES. The only winning play is the kite-and-PULL cycle: each
+      turn, if the heavy has closed within ~6 cells, MOVE your
+      raiders a few cells AWAY along the lane; otherwise attack_unit
+      the heavy from range; repeat. Kill the heavy and keep at least
+      TWO raiders alive before tick 4500. Stall (observe only) LOSES
+      — your raiders hold fire and never engage. Standing still or
+      bruting east with no disengage LOSES — the heavy cannon
+      collapses the stack.
+    overrides:
+      actors:
+        # RAIDERS — 3 medium tanks, stance:1 (ReturnFire): they
+        # return fire when shot but never open an engagement or
+        # advance on their own, so the agent must drive the kite
+        # cycle. Spread across three cells (a stack pin-piles in
+        # retreat).
+        - {type: 2tnk, owner: agent, position: [28, 9], stance: 1}
+        - {type: 2tnk, owner: agent, position: [30, 10], stance: 1}
+        - {type: 2tnk, owner: agent, position: [28, 11], stance: 1}
+        # THE HEAVY — soviet 3tnk under the hunt bot: it pursues
+        # the raiders' centroid. Mid latitude staging.
+        - {type: 3tnk, owner: enemy, position: [80, 20], stance: 2}
+        # Persistent unarmed far-east enemy marker — anti-DRAW.
+        - {type: fact, owner: enemy, position: [124, 20]}
+    win_condition:
+      all_of:
+        - {units_killed_gte: 1}
+        - {own_units_gte: 2}
+        - {within_ticks: 4500}
+    fail_condition:
+      any_of:
+        - {after_ticks: 4501}
+        - {not: {own_units_gte: 2}}
+    max_turns: 51
+
+  # ── MEDIUM ──────────────────────────────────────────────────────
+  # +1 controlled variable: the survival bar tightens to ALL THREE
+  # raiders (`own_units_gte:3`). The kite-and-pull must be PERFECT —
+  # a sloppy cycle that lets the heavy land even one cannon shot
+  # trades a raider and busts the bar. Same single-heavy diagonal-lag
+  # geometry as easy (two heavies are unkiteable by a 3-raider force
+  # in the engine combat sheet — verified — so the medium escalation
+  # is bar tightness, not enemy count).
+  medium:
+    description: >
+      Three medium-tank raiders (2tnk) stage at the centre-west
+      north corridor (y=10). ONE enemy heavy tank (3tnk) starts
+      centre-east on the mid latitude (x≈80, y=20) and HUNTS your
+      raiders. The heavy out-trades your raiders head-on. Win by
+      kiting: each turn, if the heavy is within ~6 cells MOVE your
+      raiders AWAY along the lane, else attack_unit the heavy;
+      repeat. Kill the heavy and keep ALL THREE raiders alive
+      before tick 4500 — a kite that lets the heavy land even one
+      cannon shot trades a raider and LOSES. Stall, stand-and-fight,
+      and brute attack_move all LOSE.
+    overrides:
+      actors:
+        - {type: 2tnk, owner: agent, position: [28, 9], stance: 1}
+        - {type: 2tnk, owner: agent, position: [30, 10], stance: 1}
+        - {type: 2tnk, owner: agent, position: [28, 11], stance: 1}
+        # ONE heavy on the mid latitude.
+        - {type: 3tnk, owner: enemy, position: [80, 20], stance: 2}
+        - {type: fact, owner: enemy, position: [124, 20]}
+    win_condition:
+      all_of:
+        - {units_killed_gte: 1}
+        - {own_units_gte: 3}
+        - {within_ticks: 4500}
+    fail_condition:
+      any_of:
+        - {after_ticks: 4501}
+        - {not: {own_units_gte: 3}}
+    max_turns: 51
+
+  # ── HARD ────────────────────────────────────────────────────────
+  # +2 controlled variables vs medium:
+  #   1. Tighter deadline (~3600 ticks) — the kite cadence must be
+  #      efficient: dawdle and the clock LOSES.
+  #   2. TWO agent spawn_point groups (NORTH y=10 / SOUTH y=30
+  #      corridor) round-robined by seed, so the pull vector flips
+  #      per seed and a memorised "always retreat on y=10" opening
+  #      cannot generalise. The heavy sits on the mid latitude
+  #      (y=20) between the corridors so both spawns face a
+  #      symmetric engagement geometry. The all-three survival bar
+  #      carries over from medium.
+  hard:
+    description: >
+      Three medium-tank raiders (2tnk) stage at ONE of two
+      centre-west corridors (NORTH y=10 OR SOUTH y=30, chosen by
+      seed). ONE enemy heavy tank (3tnk) starts centre-east on the
+      mid latitude (y=20) between the two corridors and HUNTS your
+      raiders. The heavy out-trades your raiders head-on; the only
+      winning play is the kite-and-PULL cycle — when the heavy
+      closes within ~6 cells MOVE your raiders AWAY along your lane,
+      else attack_unit the heavy; repeat. Kill the heavy and keep
+      ALL THREE raiders alive before tick 3600. Stall,
+      stand-and-fight, and brute attack_move all LOSE. The start
+      corridor varies by seed so a memorised opening cannot
+      generalise.
+    overrides:
+      actors:
+        # spawn_point 0 — NORTH corridor (y=10)
+        - {type: 2tnk, owner: agent, position: [28, 9], stance: 1, spawn_point: 0}
+        - {type: 2tnk, owner: agent, position: [30, 10], stance: 1, spawn_point: 0}
+        - {type: 2tnk, owner: agent, position: [28, 11], stance: 1, spawn_point: 0}
+        # spawn_point 1 — SOUTH corridor (y=30)
+        - {type: 2tnk, owner: agent, position: [28, 29], stance: 1, spawn_point: 1}
+        - {type: 2tnk, owner: agent, position: [30, 30], stance: 1, spawn_point: 1}
+        - {type: 2tnk, owner: agent, position: [28, 31], stance: 1, spawn_point: 1}
+        # One heavy centred on the mid latitude — symmetric
+        # engagement geometry from either spawn corridor.
+        - {type: 3tnk, owner: enemy, position: [80, 20], stance: 2}
+        - {type: fact, owner: enemy, position: [124, 20]}
+    win_condition:
+      all_of:
+        - {units_killed_gte: 1}
+        - {own_units_gte: 3}
+        - {within_ticks: 3600}
+    fail_condition:
+      any_of:
+        - {after_ticks: 3601}
+        - {not: {own_units_gte: 3}}
+    max_turns: 41
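
The hard tier says its two `spawn_point` groups are "round-robined by seed", but this diff does not show how the loader maps a seed to a group. The sketch below shows one plausible reading, assuming the group index is `seed % number_of_groups`. Both that mapping and the helper names `select_spawn_group` and `corridor_y` are hypothetical; only the actor positions are taken from the hard tier.

```python
# Agent actors copied from the hard tier (type, position, spawn_point).
HARD_AGENT_ACTORS = [
    {"type": "2tnk", "position": (28, 9), "spawn_point": 0},
    {"type": "2tnk", "position": (30, 10), "spawn_point": 0},
    {"type": "2tnk", "position": (28, 11), "spawn_point": 0},
    {"type": "2tnk", "position": (28, 29), "spawn_point": 1},
    {"type": "2tnk", "position": (30, 30), "spawn_point": 1},
    {"type": "2tnk", "position": (28, 31), "spawn_point": 1},
]


def select_spawn_group(actors, seed):
    """Keep only the actors of the spawn group chosen for this seed."""
    groups = sorted({a["spawn_point"] for a in actors})
    # ASSUMPTION: round-robin means seed modulo the number of groups.
    chosen = groups[seed % len(groups)]
    return [a for a in actors if a["spawn_point"] == chosen]


def corridor_y(group):
    """Median y of a group, i.e. the latitude of its corridor."""
    ys = sorted(a["position"][1] for a in group)
    return ys[len(ys) // 2]


for seed in (1, 2, 3, 4):
    group = select_spawn_group(HARD_AGENT_ACTORS, seed)
    print(seed, corridor_y(group))
```

Under this assumed mapping, the four seeds used by the tests alternate between the south (y=30) and north (y=10) corridors, so a policy that hard-codes one latitude would be exercised on both.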
tests/test_combat_kite_and_pull.py ADDED
@@ -0,0 +1,319 @@
+"""combat-kite-and-pull — ACTION capability validation.
+
+Kiting micro: a fast light strike force must hit-and-PULL a slow
+heavy enemy — strike at weapon range, retreat out of the heavy's
+lethal close-range window before it can fire back, repeat. Standing
+and fighting LOSES (the heavy cannon out-trades the raider stack
+head-on); only the move-away + attack_unit kite cycle WINS.
+
+Bar (CLAUDE.md "no defect, no cheat, no draw"):
+
+  * stall (observe-only) LOSES every tier / every hard seed — a
+    passive ReturnFire stack that never kites is overrun by the
+    hunting heavy → the survival bar fails / the deadline bites.
+  * stand-and-fight (attack_move onto the heavy, never retreat)
+    LOSES every tier / seed — the heavy cannon collapses the stack
+    head-on.
+  * brute / wrong-path (one attack_move far east, no disengage)
+    LOSES every tier / seed — same close-range trade.
+  * intended kite-and-pull (retreat when the heavy closes within
+    ~7 cells, else attack_unit) WINS every tier / every hard seed,
+    preserving ALL THREE raiders (own_units_gte:3 on medium/hard).
+  * hard tier defines ≥2 agent spawn_point groups (NORTH y=10 /
+    SOUTH y=30 corridor) round-robined by seed so a memorised
+    opening cannot generalise.
+"""
+
+from __future__ import annotations
+
+from pathlib import Path
+
+import pytest
+
+pytest.importorskip("openra_train", reason="Rust env wheel not installed")
+pytest.importorskip("openra_rl_training", reason="Rust env wheel not installed")
+
+from openra_bench.eval_core import run_level
+from openra_bench.scenarios import load_pack
+from openra_bench.scenarios.loader import PACKS_DIR, compile_level
+from openra_bench.scenarios.win_conditions import WinContext, evaluate
+
+PACK = PACKS_DIR / "combat-kite-and-pull.yaml"
+LEVELS = ("easy", "medium", "hard")
+SEEDS = (1, 2, 3, 4)
+
+
+# ── scripted policies ───────────────────────────────────────────────
+
+
+def _raiders(rs):
+    return [u for u in rs.get("units_summary", []) if u.get("type") == "2tnk"]
+
+
+def _stall(rs, C):
+    """Observe-only. A passive ReturnFire stack that never kites is
+    overrun by the hunting heavy → LOSS."""
+    return [C.observe()]
+
+
+def _stand(rs, C):
+    """Stand-and-fight: attack_move straight onto the heavy and never
+    retreat. The heavy cannon out-trades the stack head-on → LOSS."""
+    own = _raiders(rs)
+    if not own:
+        return [C.observe()]
+    return [C.attack_move([str(u["id"]) for u in own], target_x=81, target_y=20)]
+
+
+def _brute(rs, C):
+    """Brute / wrong-path: one attack_move far east, no disengage.
+    Same close-range trade as stand-and-fight → LOSS."""
+    own = _raiders(rs)
+    if not own:
+        return [C.observe()]
+    return [
+        C.attack_move(
+            [str(u["id"]) for u in own], target_x=120, target_y=own[0]["cell_y"]
+        )
+    ]
+
+
+def _kite(rs, C):
+    """Intended kite-and-pull: each turn, if the heavy has closed
+    within ~7 cells of a raider, MOVE that raider ~10 cells AWAY
+    along its lane (the PULL); otherwise attack_unit the heavy from
+    range (the STRIKE). The cycle is purely reactive — derived each
+    turn from geometry, no memory."""
+    own = _raiders(rs)
+    if not own:
+        return [C.observe()]
+    enemies = rs.get("enemy_summary") or []
+    heavies = [e for e in enemies if (e.get("type") or "").lower() == "3tnk"]
+    cmds = []
+    if heavies:
+        for u in own:
+            t = min(
+                heavies,
+                key=lambda e: abs(e["cell_x"] - u["cell_x"])
+                + abs(e["cell_y"] - u["cell_y"]),
+            )
+            d = abs(u["cell_x"] - t["cell_x"]) + abs(u["cell_y"] - t["cell_y"])
+            if d <= 7:
+                cmds.append(
+                    C.move_units(
+                        [str(u["id"])],
+                        target_x=max(4, u["cell_x"] - 10),
+                        target_y=u["cell_y"],
+                    )
+                )
+            else:
+                cmds.append(C.attack_unit([str(u["id"])], str(t["id"])))
+    else:
+        # No vision yet — march east on the staging lane until the
+        # hunting heavy comes into sight.
+        cmds.append(
+            C.move_units(
+                [str(u["id"]) for u in own],
+                target_x=min(70, own[0]["cell_x"] + 10),
+                target_y=own[0]["cell_y"],
+            )
+        )
+    return cmds
+
+
+# ── structural tests ────────────────────────────────────────────────
+
+
+def test_pack_loads_and_meta_action():
+    pack = load_pack(PACK)
+    assert pack.meta.id == "combat-kite-and-pull"
+    assert pack.meta.capability == "action"
+    assert pack.meta.real_world_meaning
+    assert pack.meta.robotics_analogue
+    anchors = " ".join(pack.meta.benchmark_anchor).lower()
+    assert "sc2 kiting micro" in anchors, anchors
+    assert "cavalry skirmish doctrine" in anchors, anchors
+
+
+def test_enemy_uses_hunt_bot_on_every_level():
+    """The heavy must HUNT — a stance:2 heavy idle in fog would never
+    be discoverable; the hunt advance brings it into vision."""
+    pack = load_pack(PACK)
+    for lvl in LEVELS:
+        c = compile_level(pack, lvl)
+        assert c.map_supported, f"{lvl}: rush-hour-arena terrain required"
+        enemy = c.scenario.enemy
+        bot = getattr(enemy, "bot_type", None) or getattr(enemy, "bot", None)
+        assert str(bot).lower() == "hunt", f"{lvl}: enemy bot must be 'hunt'; got {bot}"
+
+
+def test_tools_are_combat_only():
+    pack = load_pack(PACK)
+    tools = set(pack.base.get("tools", []) if isinstance(pack.base, dict) else [])
+    for required in ("move_units", "attack_unit", "attack_move", "stop"):
+        assert required in tools, f"missing tool: {required!r}"
+    assert "build" not in tools, "this is a combat-micro pack — no build tool"
+
+
+def test_every_level_has_reachable_timeout_fail():
+    """`after_ticks` fail must bite within max_turns; within_ticks+1
+    == after_ticks so a boundary non-finisher LOSES, not draws."""
+    pack = load_pack(PACK)
+    for lvl in LEVELS:
+        L = pack.levels[lvl]
+        ceiling = 93 + 90 * (L.max_turns - 1)
+        wt = next(
+            int(c["within_ticks"])
+            for c in L.win_condition.model_dump()["all_of"]
+            if "within_ticks" in c
+        )
+        ft = next(
+            int(c["after_ticks"])
+            for c in L.fail_condition.model_dump()["any_of"]
+            if "after_ticks" in c
+        )
+        assert wt < ceiling, f"{lvl}: within_ticks {wt} >= ceiling {ceiling}"
+        assert ft <= ceiling, f"{lvl}: after_ticks {ft} > ceiling {ceiling}"
+        assert wt + 1 == ft, f"{lvl}: within/after mismatch {wt}/{ft}"
+
+
+def test_every_level_has_a_fail_condition():
+    pack = load_pack(PACK)
+    for lvl in LEVELS:
+        c = compile_level(pack, lvl)
+        assert c.fail_condition is not None, f"{lvl} needs a fail_condition"
+
+
+def test_medium_and_hard_require_all_three_raiders():
+    """The tightened pull bar: medium/hard win only if ALL THREE
+    raiders survive (own_units_gte:3)."""
+    pack = load_pack(PACK)
+    for lvl in ("medium", "hard"):
+        L = pack.levels[lvl]
+        bar = next(
+            int(c["own_units_gte"])
+            for c in L.win_condition.model_dump()["all_of"]
+            if "own_units_gte" in c
+        )
+        assert bar == 3, f"{lvl}: survival bar must be 3; got {bar}"
+
+
+def test_hard_has_two_seed_driven_spawn_groups():
+    c = compile_level(load_pack(PACK), "hard")
+    sp = {
+        (a.spawn_point if a.spawn_point is not None else 0)
+        for a in c.scenario.actors
+        if a.owner == "agent"
+    }
+    assert sp == {0, 1}, f"hard must define spawn_point groups {{0,1}}; got {sorted(sp)}"
+
+
+def test_in_bounds_actors_on_every_level():
+    pack = load_pack(PACK)
+    for lvl in LEVELS:
+        c = compile_level(pack, lvl)
+        for a in c.scenario.actors:
+            x, y = a.position
+            assert 2 <= x <= 126 and 2 <= y <= 38, (
+                f"{lvl}: actor {a.type} at ({x},{y}) out of bounds"
+            )
+
+
+# ── predicate-level (no engine) ─────────────────────────────────────
+
+
+def _ctx(*, tick=0, killed=0, n_units=3):
+    import types
+
+    sig = types.SimpleNamespace(
+        game_tick=tick,
+        units_killed=killed,
+        units_lost=3 - n_units,
+        own_buildings=[],
+        own_building_types=set(),
+        enemies_seen_ids=set(),
+        enemy_buildings_seen_ids=set(),
+    )
+    return WinContext(
+        signals=sig,
+        render_state={
+            "units_summary": [
+                {"cell_x": 28, "cell_y": 10} for _ in range(n_units)
+            ]
+        },
+    )
+
+
+def test_predicates_enforce_kill_and_survival():
+    pe = compile_level(load_pack(PACK), "easy")
+    # easy: kill 1, ≥2 alive, in time → WIN
+    assert evaluate(pe.win_condition, _ctx(tick=1000, killed=1, n_units=2))
+    # easy: kill 0 → not win
+    assert not evaluate(pe.win_condition, _ctx(tick=1000, killed=0, n_units=3))
+    # easy: 1 raider left → fail (need ≥2)
+    assert evaluate(pe.fail_condition, _ctx(tick=1000, killed=1, n_units=1))
+
+    pm = compile_level(load_pack(PACK), "medium")
+    # medium: all 3 alive + kill → WIN
+    assert evaluate(pm.win_condition, _ctx(tick=1000, killed=1, n_units=3))
+    # medium: only 2 alive → not win, and fail fires
+    assert not evaluate(pm.win_condition, _ctx(tick=1000, killed=1, n_units=2))
+    assert evaluate(pm.fail_condition, _ctx(tick=1000, killed=1, n_units=2))
+    # medium: past deadline → fail
+    assert evaluate(pm.fail_condition, _ctx(tick=4502, killed=0, n_units=3))
+
+
+# ── engine-driven: every lazy/wrong policy LOSES, intended WINS ──────
+
+
+@pytest.mark.parametrize("level", LEVELS)
+@pytest.mark.parametrize("seed", SEEDS)
+def test_stall_loses_every_tier_and_seed(level, seed):
+    c = compile_level(load_pack(PACK), level)
+    r = run_level(c, _stall, seed=seed)
+    assert r.outcome == "loss", (
+        f"{level}/seed{seed}: stall must LOSE; got {r.outcome} "
+        f"killed={r.signals.units_killed} lost={r.signals.units_lost}"
+    )
+
+
+@pytest.mark.parametrize("level", LEVELS)
+@pytest.mark.parametrize("seed", SEEDS)
+def test_stand_and_fight_loses_every_tier_and_seed(level, seed):
+    c = compile_level(load_pack(PACK), level)
+    r = run_level(c, _stand, seed=seed)
+    assert r.outcome == "loss", (
+        f"{level}/seed{seed}: stand-and-fight must LOSE; got {r.outcome} "
+        f"killed={r.signals.units_killed} lost={r.signals.units_lost}"
+    )
+
+
+@pytest.mark.parametrize("level", LEVELS)
+@pytest.mark.parametrize("seed", SEEDS)
+def test_brute_loses_every_tier_and_seed(level, seed):
+    c = compile_level(load_pack(PACK), level)
+    r = run_level(c, _brute, seed=seed)
+    assert r.outcome == "loss", (
+        f"{level}/seed{seed}: brute attack_move must LOSE; got {r.outcome} "
+        f"killed={r.signals.units_killed} lost={r.signals.units_lost}"
+    )
+
+
+@pytest.mark.parametrize("level", LEVELS)
+@pytest.mark.parametrize("seed", SEEDS)
+def test_kite_wins_every_tier_and_seed(level, seed):
+    c = compile_level(load_pack(PACK), level)
+    r = run_level(c, _kite, seed=seed)
+    assert r.outcome == "win", (
+        f"{level}/seed{seed}: kite-and-pull must WIN; got {r.outcome} "
+        f"killed={r.signals.units_killed} lost={r.signals.units_lost}"
+    )
+
+
+def test_kite_run_is_deterministic_per_seed():
+    c = compile_level(load_pack(PACK), "medium")
+    a = run_level(c, _kite, seed=2)
+    b = run_level(c, _kite, seed=2)
+    assert (a.outcome, a.turns, a.signals.units_killed) == (
+        b.outcome, b.turns, b.signals.units_killed
+    )
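
The deadline check in `test_every_level_has_reachable_timeout_fail` is plain arithmetic, so it can be worked through without the engine. The sketch below reuses the test's formula `93 + 90 * (max_turns - 1)` and the tick values from the pack. The helper names `tick_ceiling` and `deadline_is_reachable` are made up for this example, and reading 93 and 90 as the ticks consumed by the first and each later turn is an assumption, since the diff does not explain the constants.

```python
# (within_ticks, after_ticks, max_turns) copied from the three tiers.
TIERS = {
    "easy": (4500, 4501, 51),
    "medium": (4500, 4501, 51),
    "hard": (3600, 3601, 41),
}


def tick_ceiling(max_turns: int) -> int:
    """Latest tick an episode can reach, using the test's formula."""
    return 93 + 90 * (max_turns - 1)


def deadline_is_reachable(within: int, after: int, max_turns: int) -> bool:
    """True when the fail deadline can fire before the turn budget ends
    and there is no tick gap between the win window and the fail deadline."""
    ceiling = tick_ceiling(max_turns)
    return within < ceiling and after <= ceiling and within + 1 == after


for name, (within, after, turns) in TIERS.items():
    print(name, tick_ceiling(turns), deadline_is_reachable(within, after, turns))
```

For 51 turns the ceiling is 93 + 90 × 50 = 4593, and for 41 turns it is 93 + 90 × 40 = 3693, so both the 4501 and 3601 fail deadlines fall inside the turn budget and a run that never finishes ends as a loss instead of a draw.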
tests/test_hard_tier.py CHANGED
@@ -1448,6 +1448,7 @@ UPGRADED = [
     "econ-quantitative-vs-qualitative-spend",  # hard: 2 agent spawn_point groups
     "def-tower-line-vs-cluster",  # hard: 2 agent spawn_point groups
     "coord-cover-and-move",  # hard: 2 agent spawn_point groups
+    "combat-kite-and-pull",  # hard: 2 agent spawn_point groups (Wave-12)
 ]
 
 # Consciously NOT spawn-varied, with the reason (keeps the curation