feat(scenario): combat-kite-and-pull — hit-and-pull kiting micro vs a slow heavy (SC2 kiting micro)
ACTION pack: fast 2tnk raiders must strike-then-PULL a hunting 3tnk
heavy — fire at range, retreat out of the heavy's lethal close-range
window, repeat. Stand-and-fight, brute, and stall all LOSE; only the
move-away + attack_unit kite cycle WINS. Medium/hard tighten the bar
to a perfect pull (all three raiders survive). Hard defines two
seed-driven spawn_point corridors.
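
The strike-then-pull rule in the commit message can be sketched as a pure per-turn decision function. This is a minimal illustration under stated assumptions, not the pack's scripted policy: the `Unit` dataclass, the `kite_step` name, the 6-cell pull radius, the 10-cell retreat step and the westward retreat direction are all hypothetical choices that mirror the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """Hypothetical stand-in for one unit entry in an observation."""
    id: str
    x: int
    y: int


def manhattan(a: Unit, b: Unit) -> int:
    """Cell distance used to decide whether the heavy has closed in."""
    return abs(a.x - b.x) + abs(a.y - b.y)


def kite_step(raider: Unit, heavy: Unit, pull_radius: int = 6,
              retreat_cells: int = 10, min_x: int = 4) -> tuple:
    """Return ("move", x, y) to PULL or ("attack", target_id) to STRIKE."""
    if manhattan(raider, heavy) <= pull_radius:
        # Heavy is too close: retreat west along the raider's own lane,
        # clamped so the target stays inside the map (assumed bound).
        return ("move", max(min_x, raider.x - retreat_cells), raider.y)
    # Heavy is outside the pull radius: fire at it.
    return ("attack", heavy.id)


raider = Unit("r1", 30, 10)
far = kite_step(raider, Unit("h1", 80, 20))   # heavy far away: strike
near = kite_step(raider, Unit("h1", 33, 12))  # heavy 5 cells away: pull
print(far, near)
```

Because the decision depends only on the current geometry, the same function can be called every turn without carrying any state between turns.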
openra_bench/scenarios/packs/combat-kite-and-pull.yaml
ADDED
|
@@ -0,0 +1,254 @@
|
| 1 |
+
# combat-kite-and-pull — ACTION, kiting micro: hit-and-PULL a slow
|
| 2 |
+
# heavy enemy with a fast light strike force (Wave-12).
|
| 3 |
+
#
|
| 4 |
+
# The capability under test is the STRIKE-then-PULL cycle: a fast
|
| 5 |
+
# light unit closes to weapon range, FIRES, then retreats out of the
|
| 6 |
+
# heavy's lethal close-range window BEFORE the heavy can fire back —
|
| 7 |
+
# and repeats. The heavy out-trades the light force head-on, so
|
| 8 |
+
# standing and fighting LOSES; the light force's speed advantage is
|
| 9 |
+
# the only edge, and it only pays off if the agent strings together
|
| 10 |
+
# the move-away + attack_unit cycle every turn.
|
| 11 |
+
#
|
| 12 |
+
# Real-world anchors:
|
| 13 |
+
# • SC2 kiting micro (vulture/muta vs marines, stalker-vs-zealot):
|
| 14 |
+
# fast unit fires, steps back before the slower foe closes,
|
| 15 |
+
# re-engages. The "blink-back" / micro-dance.
|
| 16 |
+
# • Cavalry skirmish doctrine: light cavalry charges to contact,
|
| 17 |
+
# looses a volley, then WHEELS AWAY before the heavier line can
|
| 18 |
+
# engage — fire-and-maneuver, never a sustained melee.
|
| 19 |
+
# • Economy-of-force: a small mobile force defeats a heavier,
|
| 20 |
+
# concentrated foe by exploiting a mobility asymmetry rather
|
| 21 |
+
# than by mass.
|
| 22 |
+
#
|
| 23 |
+
# Distinct from `combat-kite-jeep-vs-tank` (Wave-4): that pack frames
|
| 24 |
+
# the trade as "preserve ≥2 of 3"; this pack tightens the bar to a
|
| 25 |
+
# perfect PULL — medium and hard require ALL THREE raiders to survive
|
| 26 |
+
# (`own_units_gte:3` — a sloppy kite that trades even one raider for
|
| 27 |
+
# the heavy LOSES) and carries an explicit "no-disengage" brute /
|
| 28 |
+
# wrong-path policy in its four-policy bar. The shared idiom —
|
| 29 |
+
# diagonal-lag geometry, hunt-bot heavy, the move-away + attack_unit
|
| 30 |
+
# cycle — is the proven engine-realised kite test
|
| 31 |
+
# (`combat-kite-jeep-vs-tank` medium/hard).
|
| 32 |
+
#
|
| 33 |
+
# Engine-realised pairing note: in OpenRA-Rust the literal jeep MG is
|
| 34 |
+
# anti-infantry and does not dent heavy armour (engine weapons
|
| 35 |
+
# table), so the fast "raider" is the allied medium tank 2tnk
|
| 36 |
+
# (faster than the soviet 3tnk heavy, and its cannon CAN damage heavy
|
| 37 |
+
# armour). The capability under test is the kite-and-pull cycle —
|
| 38 |
+
# the unit pairing is the vehicle for that test, not the point.
|
| 39 |
+
#
|
| 40 |
+
# The four-policy bar (CLAUDE.md "no defect, no cheat, no draw"):
|
| 41 |
+
# • stall (observe only) → LOSS. The raiders are
|
| 42 |
+
# stance:1 (ReturnFire) — they auto-return fire ONLY after the
|
| 43 |
+
# heavy shoots them, but a passive stack that never kites is
|
| 44 |
+
# out-traded and overrun by the closing heavy force → the
|
| 45 |
+
# survival bar fails and/or the `after_ticks` deadline bites.
|
| 46 |
+
# • stand-and-fight (attack_move onto the heavy, never retreat)
|
| 47 |
+
# → LOSS. The heavy cannon out-trades the raider stack head-on;
|
| 48 |
+
# the raiders die before the heavy's HP runs out → the survival
|
| 49 |
+
# bar (own_units_gte) fails.
|
| 50 |
+
# • brute / wrong-path (one attack_move order far east, chase the
|
| 51 |
+
# heavy with no disengage) → LOSS. Same close-range trade as
|
| 52 |
+
# stand-and-fight; no kite cycle ⇒ the raiders are overrun.
|
| 53 |
+
# • intended kite-and-pull (each turn: if the heavy is within ~6
|
| 54 |
+
# cells, MOVE the raiders AWAY along the lane; else attack_unit
|
| 55 |
+
# the heavy; repeat) → WIN. The speed advantage keeps the heavy
|
| 56 |
+
# at the edge of the raiders' fire envelope, whittling it down
|
| 57 |
+
# across fire-then-retreat cycles while preserving the survival
|
| 58 |
+
# bar.
|
| 59 |
+
#
|
| 60 |
+
# Topology (rush-hour-arena, 128×40, playable x 2..126, y 2..38):
|
| 61 |
+
# • Raiders stage centre-west, spread across three cells (not
|
| 62 |
+
# stacked — a stack pin-piles in retreat).
|
| 63 |
+
# • The heavy starts centre-east on the MID latitude (y=20) under
|
| 64 |
+
# the hunt bot so it pursues the raiders' centroid — the agent
|
| 65 |
+
# must out-pace it, and the hunt advance is what brings the
|
| 66 |
+
# heavy into vision so the kite cycle has a target.
|
| 67 |
+
# • Raiders stage OFF the heavy's latitude (a north corridor on
|
| 68 |
+
# easy/medium) so the kite cycle has a reactive y-axis window.
|
| 69 |
+
# • Persistent unarmed enemy `fact` far east keeps the episode
|
| 70 |
+
# alive past the heavy's death so the win/fail evaluator runs
|
| 71 |
+
# (CLAUDE.md auto-done footgun).
|
| 72 |
+
#
|
| 73 |
+
# Validate (no model / no network):
|
| 74 |
+
# python3 -m pytest tests/test_combat_kite_and_pull.py -q
|
| 75 |
+
|
| 76 |
+
meta:
|
| 77 |
+
id: combat-kite-and-pull
|
| 78 |
+
title: 'Combat Micro — Kite and Pull a Slow Heavy Force'
|
| 79 |
+
capability: action
|
| 80 |
+
real_world_meaning: >
|
| 81 |
+
A fast light strike force must destroy a slower, heavier enemy
|
| 82 |
+
that out-trades it head-on. The only winning play is the
|
| 83 |
+
hit-and-PULL cycle: each turn, strike the heavy at weapon range,
|
| 84 |
+
then RETREAT the strike force out of the heavy's lethal
|
| 85 |
+
close-range window before it can fire back — and repeat. Standing
|
| 86 |
+
and fighting LOSES: the heavy cannon collapses the light force's
|
| 87 |
+
HP before its own runs out. The skill being measured is combat
|
| 88 |
+
micro under a mobility asymmetry — exploit the speed edge by
|
| 89 |
+
stringing together move-away + attack cycles instead of issuing
|
| 90 |
+
one beeline charge.
|
| 91 |
+
robotics_analogue: >
|
| 92 |
+
A fast/light agent team defeating a slow/heavy adversary by
|
| 93 |
+
exploiting a mobility asymmetry: a closed-loop evade-then-engage
|
| 94 |
+
policy rather than a one-shot commit. The per-turn decision is
|
| 95 |
+
proximity control — stay outside the adversary's lethal radius
|
| 96 |
+
while delivering effect at standoff range.
|
| 97 |
+
benchmark_anchor:
|
| 98 |
+
- "SC2 kiting micro"
|
| 99 |
+
- "cavalry skirmish doctrine"
|
| 100 |
+
- "military fire-and-maneuver doctrine"
|
| 101 |
+
- "economy-of-force"
|
| 102 |
+
author: openra-bench
|
| 103 |
+
|
| 104 |
+
base_map: rush-hour-arena
|
| 105 |
+
|
| 106 |
+
base:
|
| 107 |
+
agent: {faction: allies, cash: 0}
|
| 108 |
+
# `hunt` bot: the heavy actively PURSUES the raiders' centroid so
|
| 109 |
+
# the engagement starts on contact (no fog-blind opening) and the
|
| 110 |
+
# kite cycle has a moving target to pull. A stance:2 heavy left
|
| 111 |
+
# idle in the fog would never be discoverable; the hunt advance is
|
| 112 |
+
# what makes the heavy visible AND what the agent must out-pace.
|
| 113 |
+
enemy: {faction: soviet, cash: 0, bot_type: hunt}
|
| 114 |
+
tools: [observe, move_units, attack_unit, attack_move, stop]
|
| 115 |
+
planning: true
|
| 116 |
+
termination: {max_ticks: 7000}
|
| 117 |
+
actors: []
|
| 118 |
+
|
| 119 |
+
levels:
|
| 120 |
+
# ── EASY ────────────────────────────────────────────────────────
|
| 121 |
+
# Bare kite-and-pull skill: 3 medium-tank raiders vs ONE heavy
|
| 122 |
+
# (3tnk). Raiders stage off the heavy's latitude (north corridor
|
| 123 |
+
# y=10) so the kite cycle has a reactive window. Survival bar ≥2
|
| 124 |
+
# raiders. Stall LOSES (ReturnFire raiders never kite → overrun
|
| 124 |
+
# or deadline). Stand-and-fight / brute LOSE (the heavy
|
| 126 |
+
# cannon out-trades the stack head-on).
|
| 127 |
+
easy:
|
| 128 |
+
description: >
|
| 129 |
+
Three fast medium-tank raiders (2tnk) stage at the centre-west
|
| 130 |
+
north corridor (y=10). ONE enemy heavy tank (3tnk) sits
|
| 131 |
+
centre-east on the mid latitude (x≈80, y=20). The heavy
|
| 132 |
+
out-trades your raiders at close range — standing and fighting
|
| 133 |
+
LOSES. The only winning play is the kite-and-PULL cycle: each
|
| 134 |
+
turn, if the heavy has closed within ~6 cells, MOVE your
|
| 135 |
+
raiders a few cells AWAY along the lane; otherwise attack_unit
|
| 136 |
+
the heavy from range; repeat. Kill the heavy and keep at least
|
| 137 |
+
TWO raiders alive before tick 4500. Stall (observe only) LOSES:
|
| 138 |
+
your raiders only return fire and are overrun. Standing still or
|
| 139 |
+
bruting east with no disengage LOSES — the heavy cannon
|
| 140 |
+
collapses the stack.
|
| 141 |
+
overrides:
|
| 142 |
+
actors:
|
| 143 |
+
# RAIDERS — 3 medium tanks, stance:1 (ReturnFire): they
|
| 144 |
+
# return fire when shot but never open an engagement or
|
| 145 |
+
# advance on their own, so the agent must drive the kite
|
| 146 |
+
# cycle. Spread across three cells (a stack pin-piles in
|
| 147 |
+
# retreat).
|
| 148 |
+
- {type: 2tnk, owner: agent, position: [28, 9], stance: 1}
|
| 149 |
+
- {type: 2tnk, owner: agent, position: [30, 10], stance: 1}
|
| 150 |
+
- {type: 2tnk, owner: agent, position: [28, 11], stance: 1}
|
| 151 |
+
# THE HEAVY — soviet 3tnk under the hunt bot: it pursues
|
| 152 |
+
# the raiders' centroid. Mid latitude staging.
|
| 153 |
+
- {type: 3tnk, owner: enemy, position: [80, 20], stance: 2}
|
| 154 |
+
# Persistent unarmed far-east enemy marker — anti-DRAW.
|
| 155 |
+
- {type: fact, owner: enemy, position: [124, 20]}
|
| 156 |
+
win_condition:
|
| 157 |
+
all_of:
|
| 158 |
+
- {units_killed_gte: 1}
|
| 159 |
+
- {own_units_gte: 2}
|
| 160 |
+
- {within_ticks: 4500}
|
| 161 |
+
fail_condition:
|
| 162 |
+
any_of:
|
| 163 |
+
- {after_ticks: 4501}
|
| 164 |
+
- {not: {own_units_gte: 2}}
|
| 165 |
+
max_turns: 51
|
| 166 |
+
|
| 167 |
+
# ── MEDIUM ──────────────────────────────────────────────────────
|
| 168 |
+
# +1 controlled variable: the survival bar tightens to ALL THREE
|
| 169 |
+
# raiders (`own_units_gte:3`). The kite-and-pull must be PERFECT —
|
| 170 |
+
# a sloppy cycle that lets the heavy land even one cannon shot
|
| 171 |
+
# trades a raider and busts the bar. Same single-heavy diagonal-lag
|
| 172 |
+
# geometry as easy (two heavies are unkiteable by a 3-raider force
|
| 173 |
+
# in the engine combat sheet — verified — so the medium escalation
|
| 174 |
+
# is bar tightness, not enemy count).
|
| 175 |
+
medium:
|
| 176 |
+
description: >
|
| 177 |
+
Three medium-tank raiders (2tnk) stage at the centre-west
|
| 178 |
+
north corridor (y=10). ONE enemy heavy tank (3tnk) starts
|
| 179 |
+
centre-east on the mid latitude (x≈80, y=20) and HUNTS your
|
| 180 |
+
raiders. The heavy out-trades your raiders head-on. Win by
|
| 181 |
+
kiting: each turn, if the heavy is within ~6 cells MOVE your
|
| 182 |
+
raiders AWAY along the lane, else attack_unit the heavy;
|
| 183 |
+
repeat. Kill the heavy and keep ALL THREE raiders alive
|
| 184 |
+
before tick 4500 — a kite that lets the heavy land even one
|
| 185 |
+
cannon shot trades a raider and LOSES. Stall, stand-and-fight,
|
| 186 |
+
and brute attack_move all LOSE.
|
| 187 |
+
overrides:
|
| 188 |
+
actors:
|
| 189 |
+
- {type: 2tnk, owner: agent, position: [28, 9], stance: 1}
|
| 190 |
+
- {type: 2tnk, owner: agent, position: [30, 10], stance: 1}
|
| 191 |
+
- {type: 2tnk, owner: agent, position: [28, 11], stance: 1}
|
| 192 |
+
# ONE heavy on the mid latitude.
|
| 193 |
+
- {type: 3tnk, owner: enemy, position: [80, 20], stance: 2}
|
| 194 |
+
- {type: fact, owner: enemy, position: [124, 20]}
|
| 195 |
+
win_condition:
|
| 196 |
+
all_of:
|
| 197 |
+
- {units_killed_gte: 1}
|
| 198 |
+
- {own_units_gte: 3}
|
| 199 |
+
- {within_ticks: 4500}
|
| 200 |
+
fail_condition:
|
| 201 |
+
any_of:
|
| 202 |
+
- {after_ticks: 4501}
|
| 203 |
+
- {not: {own_units_gte: 3}}
|
| 204 |
+
max_turns: 51
|
| 205 |
+
|
| 206 |
+
# ── HARD ────────────────────────────────────────────────────────
|
| 207 |
+
# +2 controlled variables vs medium:
|
| 208 |
+
# 1. Tighter deadline (~3600 ticks) — the kite cadence must be
|
| 209 |
+
# efficient: dawdle and the clock LOSES.
|
| 210 |
+
# 2. TWO agent spawn_point groups (NORTH y=10 / SOUTH y=30
|
| 211 |
+
# corridor) round-robined by seed, so the pull vector flips
|
| 212 |
+
# per seed and a memorised "always retreat on y=10" opening
|
| 213 |
+
# cannot generalise. The heavy sits on the mid latitude
|
| 214 |
+
# (y=20) between the corridors so both spawns face a
|
| 215 |
+
# symmetric engagement geometry. The all-three survival bar
|
| 216 |
+
# carries over from medium.
|
| 217 |
+
hard:
|
| 218 |
+
description: >
|
| 219 |
+
Three medium-tank raiders (2tnk) stage at ONE of two
|
| 220 |
+
centre-west corridors (NORTH y=10 OR SOUTH y=30, chosen by
|
| 221 |
+
seed). ONE enemy heavy tank (3tnk) starts centre-east on the
|
| 222 |
+
mid latitude (y=20) between the two corridors and HUNTS your
|
| 223 |
+
raiders. The heavy out-trades your raiders head-on; the only
|
| 224 |
+
winning play is the kite-and-PULL cycle — when the heavy
|
| 225 |
+
closes within ~6 cells MOVE your raiders AWAY along your lane,
|
| 226 |
+
else attack_unit the heavy; repeat. Kill the heavy and keep
|
| 227 |
+
ALL THREE raiders alive before tick 3600. Stall,
|
| 228 |
+
stand-and-fight, and brute attack_move all LOSE. The start
|
| 229 |
+
corridor varies by seed so a memorised opening cannot
|
| 230 |
+
generalise.
|
| 231 |
+
overrides:
|
| 232 |
+
actors:
|
| 233 |
+
# spawn_point 0 — NORTH corridor (y=10)
|
| 234 |
+
- {type: 2tnk, owner: agent, position: [28, 9], stance: 1, spawn_point: 0}
|
| 235 |
+
- {type: 2tnk, owner: agent, position: [30, 10], stance: 1, spawn_point: 0}
|
| 236 |
+
- {type: 2tnk, owner: agent, position: [28, 11], stance: 1, spawn_point: 0}
|
| 237 |
+
# spawn_point 1 — SOUTH corridor (y=30)
|
| 238 |
+
- {type: 2tnk, owner: agent, position: [28, 29], stance: 1, spawn_point: 1}
|
| 239 |
+
- {type: 2tnk, owner: agent, position: [30, 30], stance: 1, spawn_point: 1}
|
| 240 |
+
- {type: 2tnk, owner: agent, position: [28, 31], stance: 1, spawn_point: 1}
|
| 241 |
+
# One heavy centred on the mid latitude — symmetric
|
| 242 |
+
# engagement geometry from either spawn corridor.
|
| 243 |
+
- {type: 3tnk, owner: enemy, position: [80, 20], stance: 2}
|
| 244 |
+
- {type: fact, owner: enemy, position: [124, 20]}
|
| 245 |
+
win_condition:
|
| 246 |
+
all_of:
|
| 247 |
+
- {units_killed_gte: 1}
|
| 248 |
+
- {own_units_gte: 3}
|
| 249 |
+
- {within_ticks: 3600}
|
| 250 |
+
fail_condition:
|
| 251 |
+
any_of:
|
| 252 |
+
- {after_ticks: 3601}
|
| 253 |
+
- {not: {own_units_gte: 3}}
|
| 254 |
+
max_turns: 41
|
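
The `win_condition` and `fail_condition` blocks in the pack combine leaf predicates with `all_of`, `any_of` and `not`. A minimal sketch of how such a tree could be evaluated against plain state is shown here. The `evaluate_tree` function and the `state` keys are hypothetical, the real `openra_bench` evaluator may differ, and treating `within_ticks` as `tick <= N` and `after_ticks` as `tick >= N` is an assumption.

```python
def evaluate_tree(cond: dict, state: dict) -> bool:
    """Evaluate a nested all_of / any_of / not predicate tree (illustrative)."""
    (key, arg), = cond.items()  # each node is assumed to be a single-key mapping
    if key == "all_of":
        return all(evaluate_tree(c, state) for c in arg)
    if key == "any_of":
        return any(evaluate_tree(c, state) for c in arg)
    if key == "not":
        return not evaluate_tree(arg, state)
    if key == "units_killed_gte":
        return state["units_killed"] >= arg
    if key == "own_units_gte":
        return state["own_units"] >= arg
    if key == "within_ticks":   # assumed inclusive upper bound
        return state["tick"] <= arg
    if key == "after_ticks":    # assumed inclusive lower bound
        return state["tick"] >= arg
    raise ValueError(f"unknown predicate: {key}")


# Medium-tier conditions copied from the pack above.
MEDIUM_WIN = {"all_of": [{"units_killed_gte": 1}, {"own_units_gte": 3},
                         {"within_ticks": 4500}]}
MEDIUM_FAIL = {"any_of": [{"after_ticks": 4501},
                          {"not": {"own_units_gte": 3}}]}

perfect = {"tick": 1000, "units_killed": 1, "own_units": 3}
sloppy = {"tick": 1000, "units_killed": 1, "own_units": 2}
print(evaluate_tree(MEDIUM_WIN, perfect), evaluate_tree(MEDIUM_FAIL, sloppy))
```

Under these assumed semantics, a kite that kills the heavy but loses one raider does not satisfy the medium win tree and does satisfy the fail tree, which matches the "perfect pull" bar the pack comments describe.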
tests/test_combat_kite_and_pull.py
ADDED
|
@@ -0,0 +1,319 @@
|
| 1 |
+
"""combat-kite-and-pull — ACTION capability validation.
|
| 2 |
+
|
| 3 |
+
Kiting micro: a fast light strike force must hit-and-PULL a slow
|
| 4 |
+
heavy enemy — strike at weapon range, retreat out of the heavy's
|
| 5 |
+
lethal close-range window before it can fire back, repeat. Standing
|
| 6 |
+
and fighting LOSES (the heavy cannon out-trades the raider stack
|
| 7 |
+
head-on); only the move-away + attack_unit kite cycle WINS.
|
| 8 |
+
|
| 9 |
+
Bar (CLAUDE.md "no defect, no cheat, no draw"):
|
| 10 |
+
|
| 11 |
+
* stall (observe-only) LOSES every tier / every hard seed — a
|
| 12 |
+
passive ReturnFire stack that never kites is overrun by the
|
| 13 |
+
hunting heavy → the survival bar fails / the deadline bites.
|
| 14 |
+
* stand-and-fight (attack_move onto the heavy, never retreat)
|
| 15 |
+
LOSES every tier / seed — the heavy cannon collapses the stack
|
| 16 |
+
head-on.
|
| 17 |
+
* brute / wrong-path (one attack_move far east, no disengage)
|
| 18 |
+
LOSES every tier / seed — same close-range trade.
|
| 19 |
+
* intended kite-and-pull (retreat when the heavy closes within
|
| 20 |
+
~7 cells, else attack_unit) WINS every tier / every hard seed,
|
| 21 |
+
preserving ALL THREE raiders (own_units_gte:3 on medium/hard).
|
| 22 |
+
* hard tier defines ≥2 agent spawn_point groups (NORTH y=10 /
|
| 23 |
+
SOUTH y=30 corridor) round-robined by seed so a memorised
|
| 24 |
+
opening cannot generalise.
|
| 25 |
+
"""
|
| 26 |
+
|
| 27 |
+
from __future__ import annotations
|
| 28 |
+
|
| 29 |
+
from pathlib import Path
|
| 30 |
+
|
| 31 |
+
import pytest
|
| 32 |
+
|
| 33 |
+
pytest.importorskip("openra_train", reason="Rust env wheel not installed")
|
| 34 |
+
pytest.importorskip("openra_rl_training", reason="Rust env wheel not installed")
|
| 35 |
+
|
| 36 |
+
from openra_bench.eval_core import run_level
|
| 37 |
+
from openra_bench.scenarios import load_pack
|
| 38 |
+
from openra_bench.scenarios.loader import PACKS_DIR, compile_level
|
| 39 |
+
from openra_bench.scenarios.win_conditions import WinContext, evaluate
|
| 40 |
+
|
| 41 |
+
PACK = PACKS_DIR / "combat-kite-and-pull.yaml"
|
| 42 |
+
LEVELS = ("easy", "medium", "hard")
|
| 43 |
+
SEEDS = (1, 2, 3, 4)
|
| 44 |
+
|
| 45 |
+
|
| 46 |
+
# ── scripted policies ───────────────────────────────────────────────
|
| 47 |
+
|
| 48 |
+
|
| 49 |
+
def _raiders(rs):
|
| 50 |
+
return [u for u in rs.get("units_summary", []) if u.get("type") == "2tnk"]
|
| 51 |
+
|
| 52 |
+
|
| 53 |
+
def _stall(rs, C):
|
| 54 |
+
"""Observe-only. A passive ReturnFire stack that never kites is
|
| 55 |
+
overrun by the hunting heavy → LOSS."""
|
| 56 |
+
return [C.observe()]
|
| 57 |
+
|
| 58 |
+
|
| 59 |
+
def _stand(rs, C):
|
| 60 |
+
"""Stand-and-fight: attack_move straight onto the heavy and never
|
| 61 |
+
retreat. The heavy cannon out-trades the stack head-on → LOSS."""
|
| 62 |
+
own = _raiders(rs)
|
| 63 |
+
if not own:
|
| 64 |
+
return [C.observe()]
|
| 65 |
+
return [C.attack_move([str(u["id"]) for u in own], target_x=81, target_y=20)]
|
| 66 |
+
|
| 67 |
+
|
| 68 |
+
def _brute(rs, C):
|
| 69 |
+
"""Brute / wrong-path: one attack_move far east, no disengage.
|
| 70 |
+
Same close-range trade as stand-and-fight → LOSS."""
|
| 71 |
+
own = _raiders(rs)
|
| 72 |
+
if not own:
|
| 73 |
+
return [C.observe()]
|
| 74 |
+
return [
|
| 75 |
+
C.attack_move(
|
| 76 |
+
[str(u["id"]) for u in own], target_x=120, target_y=own[0]["cell_y"]
|
| 77 |
+
)
|
| 78 |
+
]
|
| 79 |
+
|
| 80 |
+
|
| 81 |
+
def _kite(rs, C):
|
| 82 |
+
"""Intended kite-and-pull: each turn, if the heavy has closed
|
| 83 |
+
within ~7 cells of a raider, MOVE that raider ~10 cells AWAY
|
| 84 |
+
along its lane (the PULL); otherwise attack_unit the heavy from
|
| 85 |
+
range (the STRIKE). The cycle is purely reactive — derived each
|
| 86 |
+
turn from geometry, no memory."""
|
| 87 |
+
own = _raiders(rs)
|
| 88 |
+
if not own:
|
| 89 |
+
return [C.observe()]
|
| 90 |
+
enemies = rs.get("enemy_summary") or []
|
| 91 |
+
heavies = [e for e in enemies if (e.get("type") or "").lower() == "3tnk"]
|
| 92 |
+
cmds = []
|
| 93 |
+
if heavies:
|
| 94 |
+
for u in own:
|
| 95 |
+
t = min(
|
| 96 |
+
heavies,
|
| 97 |
+
key=lambda e: abs(e["cell_x"] - u["cell_x"])
|
| 98 |
+
+ abs(e["cell_y"] - u["cell_y"]),
|
| 99 |
+
)
|
| 100 |
+
d = abs(u["cell_x"] - t["cell_x"]) + abs(u["cell_y"] - t["cell_y"])
|
| 101 |
+
if d <= 7:
|
| 102 |
+
cmds.append(
|
| 103 |
+
C.move_units(
|
| 104 |
+
[str(u["id"])],
|
| 105 |
+
target_x=max(4, u["cell_x"] - 10),
|
| 106 |
+
target_y=u["cell_y"],
|
| 107 |
+
)
|
| 108 |
+
)
|
| 109 |
+
else:
|
| 110 |
+
cmds.append(C.attack_unit([str(u["id"])], str(t["id"])))
|
| 111 |
+
else:
|
| 112 |
+
# No vision yet — march east on the staging lane until the
|
| 113 |
+
# hunting heavy comes into sight.
|
| 114 |
+
cmds.append(
|
| 115 |
+
C.move_units(
|
| 116 |
+
[str(u["id"]) for u in own],
|
| 117 |
+
target_x=min(70, own[0]["cell_x"] + 10),
|
| 118 |
+
target_y=own[0]["cell_y"],
|
| 119 |
+
)
|
| 120 |
+
)
|
| 121 |
+
return cmds
|
| 122 |
+
|
| 123 |
+
|
| 124 |
+
# ── structural tests ────────────────────────────────────────────────
|
| 125 |
+
|
| 126 |
+
|
| 127 |
+
def test_pack_loads_and_meta_action():
|
| 128 |
+
pack = load_pack(PACK)
|
| 129 |
+
assert pack.meta.id == "combat-kite-and-pull"
|
| 130 |
+
assert pack.meta.capability == "action"
|
| 131 |
+
assert pack.meta.real_world_meaning
|
| 132 |
+
assert pack.meta.robotics_analogue
|
| 133 |
+
anchors = " ".join(pack.meta.benchmark_anchor).lower()
|
| 134 |
+
assert "sc2 kiting micro" in anchors, anchors
|
| 135 |
+
assert "cavalry skirmish doctrine" in anchors, anchors
|
| 136 |
+
|
| 137 |
+
|
| 138 |
+
def test_enemy_uses_hunt_bot_on_every_level():
|
| 139 |
+
"""The heavy must HUNT — a stance:2 heavy idle in fog would never
|
| 140 |
+
be discoverable; the hunt advance brings it into vision."""
|
| 141 |
+
pack = load_pack(PACK)
|
| 142 |
+
for lvl in LEVELS:
|
| 143 |
+
c = compile_level(pack, lvl)
|
| 144 |
+
assert c.map_supported, f"{lvl}: rush-hour-arena terrain required"
|
| 145 |
+
enemy = c.scenario.enemy
|
| 146 |
+
bot = getattr(enemy, "bot_type", None) or getattr(enemy, "bot", None)
|
| 147 |
+
assert str(bot).lower() == "hunt", f"{lvl}: enemy bot must be 'hunt'; got {bot}"
|
| 148 |
+
|
| 149 |
+
|
| 150 |
+
def test_tools_are_combat_only():
|
| 151 |
+
pack = load_pack(PACK)
|
| 152 |
+
tools = set(pack.base.get("tools", []) if isinstance(pack.base, dict) else [])
|
| 153 |
+
for required in ("move_units", "attack_unit", "attack_move", "stop"):
|
| 154 |
+
assert required in tools, f"missing tool: {required!r}"
|
| 155 |
+
assert "build" not in tools, "this is a combat-micro pack — no build tool"
|
| 156 |
+
|
| 157 |
+
|
| 158 |
+
def test_every_level_has_reachable_timeout_fail():
|
| 159 |
+
"""`after_ticks` fail must bite within max_turns; within_ticks+1
|
| 160 |
+
== after_ticks so a boundary non-finisher LOSES, not draws."""
|
| 161 |
+
pack = load_pack(PACK)
|
| 162 |
+
for lvl in LEVELS:
|
| 163 |
+
L = pack.levels[lvl]
|
| 164 |
+
ceiling = 93 + 90 * (L.max_turns - 1)
|
| 165 |
+
wt = next(
|
| 166 |
+
int(c["within_ticks"])
|
| 167 |
+
for c in L.win_condition.model_dump()["all_of"]
|
| 168 |
+
if "within_ticks" in c
|
| 169 |
+
)
|
| 170 |
+
ft = next(
|
| 171 |
+
int(c["after_ticks"])
|
| 172 |
+
for c in L.fail_condition.model_dump()["any_of"]
|
| 173 |
+
if "after_ticks" in c
|
| 174 |
+
)
|
| 175 |
+
assert wt < ceiling, f"{lvl}: within_ticks {wt} >= ceiling {ceiling}"
|
| 176 |
+
assert ft <= ceiling, f"{lvl}: after_ticks {ft} > ceiling {ceiling}"
|
| 177 |
+
assert wt + 1 == ft, f"{lvl}: within/after mismatch {wt}/{ft}"
|
| 178 |
+
|
| 179 |
+
|
| 180 |
+
def test_every_level_has_a_fail_condition():
|
| 181 |
+
pack = load_pack(PACK)
|
| 182 |
+
for lvl in LEVELS:
|
| 183 |
+
c = compile_level(pack, lvl)
|
| 184 |
+
assert c.fail_condition is not None, f"{lvl} needs a fail_condition"
|
| 185 |
+
|
| 186 |
+
|
| 187 |
+
def test_medium_and_hard_require_all_three_raiders():
|
| 188 |
+
"""The tightened pull bar: medium/hard win only if ALL THREE
|
| 189 |
+
raiders survive (own_units_gte:3)."""
|
| 190 |
+
pack = load_pack(PACK)
|
| 191 |
+
for lvl in ("medium", "hard"):
|
| 192 |
+
L = pack.levels[lvl]
|
| 193 |
+
bar = next(
|
| 194 |
+
int(c["own_units_gte"])
|
| 195 |
+
for c in L.win_condition.model_dump()["all_of"]
|
| 196 |
+
if "own_units_gte" in c
|
| 197 |
+
)
|
| 198 |
+
assert bar == 3, f"{lvl}: survival bar must be 3; got {bar}"
|
| 199 |
+
|
| 200 |
+
|
| 201 |
+
def test_hard_has_two_seed_driven_spawn_groups():
|
| 202 |
+
c = compile_level(load_pack(PACK), "hard")
|
| 203 |
+
sp = {
|
| 204 |
+
(a.spawn_point if a.spawn_point is not None else 0)
|
| 205 |
+
for a in c.scenario.actors
|
| 206 |
+
if a.owner == "agent"
|
| 207 |
+
}
|
| 208 |
+
assert sp == {0, 1}, f"hard must define spawn_point groups {{0,1}}; got {sorted(sp)}"
|
| 209 |
+
|
| 210 |
+
|
| 211 |
+
def test_in_bounds_actors_on_every_level():
|
| 212 |
+
pack = load_pack(PACK)
|
| 213 |
+
for lvl in LEVELS:
|
| 214 |
+
c = compile_level(pack, lvl)
|
| 215 |
+
for a in c.scenario.actors:
|
| 216 |
+
x, y = a.position
|
| 217 |
+
assert 2 <= x <= 126 and 2 <= y <= 38, (
|
| 218 |
+
f"{lvl}: actor {a.type} at ({x},{y}) out of bounds"
|
| 219 |
+
)
|
| 220 |
+
|
| 221 |
+
|
| 222 |
+
# ── predicate-level (no engine) ─────────────────────────────────────
|
| 223 |
+
|
| 224 |
+
|
| 225 |
+
def _ctx(*, tick=0, killed=0, n_units=3):
|
| 226 |
+
import types
|
| 227 |
+
|
| 228 |
+
sig = types.SimpleNamespace(
|
| 229 |
+
game_tick=tick,
|
| 230 |
+
units_killed=killed,
|
| 231 |
+
units_lost=3 - n_units,
|
| 232 |
+
own_buildings=[],
|
| 233 |
+
own_building_types=set(),
|
| 234 |
+
enemies_seen_ids=set(),
|
| 235 |
+
enemy_buildings_seen_ids=set(),
|
| 236 |
+
)
|
| 237 |
+
return WinContext(
|
| 238 |
+
signals=sig,
|
| 239 |
+
render_state={
|
| 240 |
+
"units_summary": [
|
| 241 |
+
{"cell_x": 28, "cell_y": 10} for _ in range(n_units)
|
| 242 |
+
]
|
| 243 |
+
},
|
| 244 |
+
)
|
| 245 |
+
|
| 246 |
+
|
| 247 |
+
def test_predicates_enforce_kill_and_survival():
|
| 248 |
+
pe = compile_level(load_pack(PACK), "easy")
|
| 249 |
+
# easy: kill 1, ≥2 alive, in time → WIN
|
| 250 |
+
assert evaluate(pe.win_condition, _ctx(tick=1000, killed=1, n_units=2))
|
| 251 |
+
# easy: kill 0 → not win
|
| 252 |
+
assert not evaluate(pe.win_condition, _ctx(tick=1000, killed=0, n_units=3))
|
| 253 |
+
# easy: 1 raider left → fail (need ≥2)
|
| 254 |
+
assert evaluate(pe.fail_condition, _ctx(tick=1000, killed=1, n_units=1))
|
| 255 |
+
|
| 256 |
+
pm = compile_level(load_pack(PACK), "medium")
|
| 257 |
+
# medium: all 3 alive + kill → WIN
|
| 258 |
+
assert evaluate(pm.win_condition, _ctx(tick=1000, killed=1, n_units=3))
|
| 259 |
+
# medium: only 2 alive → not win, and fail fires
|
| 260 |
+
assert not evaluate(pm.win_condition, _ctx(tick=1000, killed=1, n_units=2))
|
| 261 |
+
assert evaluate(pm.fail_condition, _ctx(tick=1000, killed=1, n_units=2))
|
| 262 |
+
# medium: past deadline → fail
|
| 263 |
+
assert evaluate(pm.fail_condition, _ctx(tick=4502, killed=0, n_units=3))
|
| 264 |
+
|
| 265 |
+
|
| 266 |
+
# ── engine-driven: every lazy/wrong policy LOSES, intended WINS ──────
|
| 267 |
+
|
| 268 |
+
|
| 269 |
+
@pytest.mark.parametrize("level", LEVELS)
|
| 270 |
+
@pytest.mark.parametrize("seed", SEEDS)
|
| 271 |
+
def test_stall_loses_every_tier_and_seed(level, seed):
|
| 272 |
+
c = compile_level(load_pack(PACK), level)
|
| 273 |
+
r = run_level(c, _stall, seed=seed)
|
| 274 |
+
assert r.outcome == "loss", (
|
| 275 |
+
f"{level}/seed{seed}: stall must LOSE; got {r.outcome} "
|
| 276 |
+
f"killed={r.signals.units_killed} lost={r.signals.units_lost}"
|
| 277 |
+
)
|
| 278 |
+
|
| 279 |
+
|
| 280 |
+
@pytest.mark.parametrize("level", LEVELS)
|
| 281 |
+
@pytest.mark.parametrize("seed", SEEDS)
|
| 282 |
+
def test_stand_and_fight_loses_every_tier_and_seed(level, seed):
|
| 283 |
+
c = compile_level(load_pack(PACK), level)
|
| 284 |
+
r = run_level(c, _stand, seed=seed)
|
| 285 |
+
assert r.outcome == "loss", (
|
| 286 |
+
f"{level}/seed{seed}: stand-and-fight must LOSE; got {r.outcome} "
|
| 287 |
+
f"killed={r.signals.units_killed} lost={r.signals.units_lost}"
|
| 288 |
+
)
|
| 289 |
+
|
| 290 |
+
|
| 291 |
+
@pytest.mark.parametrize("level", LEVELS)
|
| 292 |
+
@pytest.mark.parametrize("seed", SEEDS)
|
| 293 |
+
def test_brute_loses_every_tier_and_seed(level, seed):
|
| 294 |
+
c = compile_level(load_pack(PACK), level)
|
| 295 |
+
r = run_level(c, _brute, seed=seed)
|
| 296 |
+
assert r.outcome == "loss", (
|
| 297 |
+
f"{level}/seed{seed}: brute attack_move must LOSE; got {r.outcome} "
|
| 298 |
+
f"killed={r.signals.units_killed} lost={r.signals.units_lost}"
|
| 299 |
+
)
|
| 300 |
+
|
| 301 |
+
|
| 302 |
+
@pytest.mark.parametrize("level", LEVELS)
|
| 303 |
+
@pytest.mark.parametrize("seed", SEEDS)
|
| 304 |
+
def test_kite_wins_every_tier_and_seed(level, seed):
|
| 305 |
+
c = compile_level(load_pack(PACK), level)
|
| 306 |
+
r = run_level(c, _kite, seed=seed)
|
| 307 |
+
assert r.outcome == "win", (
|
| 308 |
+
f"{level}/seed{seed}: kite-and-pull must WIN; got {r.outcome} "
|
| 309 |
+
f"killed={r.signals.units_killed} lost={r.signals.units_lost}"
|
| 310 |
+
)
|
| 311 |
+
|
| 312 |
+
|
| 313 |
+
def test_kite_run_is_deterministic_per_seed():
|
| 314 |
+
c = compile_level(load_pack(PACK), "medium")
|
| 315 |
+
a = run_level(c, _kite, seed=2)
|
| 316 |
+
b = run_level(c, _kite, seed=2)
|
| 317 |
+
assert (a.outcome, a.turns, a.signals.units_killed) == (
|
| 318 |
+
b.outcome, b.turns, b.signals.units_killed
|
| 319 |
+
)
|
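
The deadline check in `test_every_level_has_reachable_timeout_fail` relies on the arithmetic `93 + 90 * (max_turns - 1)`. A small worked sketch for the three tiers follows. The constants 93 and 90 are taken from the test; reading them as "tick of the first turn" and "ticks per turn" is an assumption, as are the `tick_ceiling` and `TIERS` names.

```python
def tick_ceiling(max_turns: int, first_turn: int = 93, stride: int = 90) -> int:
    """Highest game tick reachable within max_turns (assumed interpretation)."""
    return first_turn + stride * (max_turns - 1)


# (max_turns, within_ticks, after_ticks) copied from the pack's three tiers.
TIERS = {
    "easy": (51, 4500, 4501),
    "medium": (51, 4500, 4501),
    "hard": (41, 3600, 3601),
}

for name, (max_turns, within, after) in TIERS.items():
    ceiling = tick_ceiling(max_turns)
    # Same three checks the test makes: the deadline is reachable and
    # win/fail boundaries are adjacent, so a non-finisher loses.
    assert within < ceiling and after <= ceiling and within + 1 == after
    print(name, ceiling, ceiling - after)
```

With these numbers every tier leaves 92 ticks between the `after_ticks` deadline and the turn ceiling, so the timeout fail can fire before the episode runs out of turns.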
tests/test_hard_tier.py
CHANGED
|
@@ -1448,6 +1448,7 @@ UPGRADED = [
|
| 1448 |
"econ-quantitative-vs-qualitative-spend", # hard: 2 agent spawn_point groups
|
| 1449 |
"def-tower-line-vs-cluster", # hard: 2 agent spawn_point groups
|
| 1450 |
"coord-cover-and-move", # hard: 2 agent spawn_point groups
|
| 1451 |
+
"combat-kite-and-pull", # hard: 2 agent spawn_point groups (Wave-12)
|
| 1452 |
]
|
| 1453 |
|
| 1454 |
# Consciously NOT spawn-varied, with the reason (keeps the curation
|
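
The "2 agent spawn_point groups" noted for this pack mean that only one group of raiders is present per seed. A minimal sketch of one way a seed could pick a group is given here. The `select_spawn_group` helper and the `seed % number_of_groups` mapping are assumptions for illustration; the source only says the groups are round-robined by seed and does not state the exact mapping.

```python
from collections import defaultdict

# Agent actors from the hard tier of combat-kite-and-pull (positions as [x, y]).
ACTORS = [
    {"type": "2tnk", "position": [28, 9], "spawn_point": 0},
    {"type": "2tnk", "position": [30, 10], "spawn_point": 0},
    {"type": "2tnk", "position": [28, 11], "spawn_point": 0},
    {"type": "2tnk", "position": [28, 29], "spawn_point": 1},
    {"type": "2tnk", "position": [30, 30], "spawn_point": 1},
    {"type": "2tnk", "position": [28, 31], "spawn_point": 1},
]


def select_spawn_group(actors: list, seed: int) -> list:
    """Pick one spawn_point group by seed (hypothetical modulo round-robin)."""
    groups = defaultdict(list)
    for actor in actors:
        groups[actor.get("spawn_point", 0)].append(actor)
    keys = sorted(groups)
    return groups[keys[seed % len(keys)]]


for seed in (1, 2, 3, 4):
    lane = sorted(a["position"][1] for a in select_spawn_group(ACTORS, seed))
    print(seed, lane)
```

Whatever the real mapping is, the point of the variation is the same: the raiders' lane sits north or south of the heavy depending on the seed, so a fixed opening tuned to one corridor does not transfer.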