fix(scenario): combat-tank-vs-tank-engagement — recalibrate after engine movement fixes
openra_bench/scenarios/packs/combat-tank-vs-tank-engagement.yaml
CHANGED
|
@@ -1,105 +1,65 @@
|
|
| 1 |
-
# combat-tank-vs-tank-engagement —
|
| 2 |
-
#
|
| 3 |
-
#
|
| 4 |
-
#
|
| 5 |
#
|
| 6 |
# Wave-7 ACTION pack (capability: action — combat micro: target
|
| 7 |
# prioritization / focus-fire discipline).
|
| 8 |
#
|
| 9 |
# Real-world / benchmark anchors:
|
| 10 |
-
# - SC2 mirror micro
|
| 11 |
-
#
|
| 12 |
-
#
|
| 13 |
-
#
|
| 14 |
-
#
|
| 15 |
-
#
|
| 16 |
-
#
|
| 17 |
-
#
|
| 18 |
-
#
|
| 19 |
-
# of War): a smaller or equal force concentrated at the decisive
|
| 20 |
-
# point can defeat a numerically equivalent dispersed enemy.
|
| 21 |
#
|
| 22 |
-
#
|
| 23 |
-
#
|
| 24 |
-
#
|
| 25 |
-
#
|
| 26 |
-
#
|
| 27 |
-
#
|
| 28 |
-
#
|
| 29 |
-
# pack
|
| 30 |
-
#
|
| 31 |
-
#
|
| 32 |
-
#
|
| 33 |
-
#
|
| 34 |
-
#
|
| 35 |
-
#
|
| 36 |
-
#
|
| 37 |
-
#
|
| 38 |
-
#
|
| 39 |
-
#
|
| 40 |
-
#
|
| 41 |
-
#
|
| 42 |
-
#
|
| 43 |
-
#
|
| 44 |
-
#
|
| 45 |
-
#
|
| 46 |
-
# begins, the trade collapses to 1-vs-1 duels and Lanchester
|
| 47 |
-
# linear ⇒ 2 of the 3 agent tanks die in the flank engagements.
|
| 48 |
-
# * FOCUS policy (ALL 3 tanks attack_unit the SAME target in
|
| 49 |
-
# sequence — closest first, then a flank, then the last):
|
| 50 |
-
# 3-vs-1 concentrated cannon fire ends each enemy in ~1-2
|
| 51 |
-
# decision turns; after kill #1 the trade is 3-vs-2 (Lanchester
|
| 52 |
-
# surplus 3²−2² = 5), after kill #2 it is 3-vs-1; all 3 agent
|
| 53 |
-
# tanks survive.
|
| 54 |
#
|
| 55 |
-
#
|
| 56 |
-
#
|
| 57 |
-
#
|
| 58 |
-
#
|
| 59 |
-
#
|
| 60 |
-
#
|
| 61 |
-
#
|
| 62 |
-
#
|
| 63 |
-
#
|
| 64 |
-
#
|
| 65 |
-
# trips own_units_gte:1 on easy).
|
| 66 |
-
# • spread-attack-closest (each tank attack_units its own closest
|
| 67 |
-
# enemy): as above — once the centre dies, surviving tanks chase
|
| 68 |
-
# flank enemies on 1-vs-1 duels; Lanchester linear ⇒ 2 of 3 tanks
|
| 69 |
-
# die. On EASY (own_units_gte:1) the 1 survivor squeaks through
|
| 70 |
-
# and SPREAD wins (forgiving bare-skill tier, per the
|
| 71 |
-
# SCENARIO_REVIEW_CHECKLIST inert-easy-teeth convention). On
|
| 72 |
-
# MEDIUM (own_units_gte:2) the 1 survivor is below the bar ⇒
|
| 73 |
-
# LOSS — this is the load-bearing discrimination.
|
| 74 |
-
# • intended focus-fire (ALL 3 tanks attack_unit the SAME target
|
| 75 |
-
# each turn, starting with the closest enemy by agent centroid,
|
| 76 |
-
# then re-targeting the next-closest as enemies fall): all 3
|
| 77 |
-
# enemies die in ~700-900 ticks, all 3 agent tanks alive at the
|
| 78 |
-
# end ⇒ WIN on every level.
|
| 79 |
-
#
|
| 80 |
-
# Win-bar relaxation note (RELAXED per the task brief): on HARD the
|
| 81 |
-
# survival cap holds at own_units_gte:2 nominally, but the asymmetric
|
| 82 |
-
# discrimination weakens when the agent stack starts on a FLANK
|
| 83 |
-
# latitude (NORTH y=11..13 or SOUTH y=27..29) — from a flank the
|
| 84 |
-
# enemy line at y=15/20/25 has a unique closest enemy that all agent
|
| 85 |
-
# tanks naturally target (spread ≡ focus). Hard's discrimination is
|
| 86 |
-
# therefore primarily KILL-SPEED (within_ticks 1200) + brute / stall
|
| 87 |
-
# anti-cheat teeth + spawn-variation generalisation across NORTH and
|
| 88 |
-
# SOUTH approach axes — the focus-fire skill is what generalises;
|
| 89 |
-
# spread-as-focus on a flank is acceptable because it IS the
|
| 90 |
-
# intended capability when the geometry collapses to a unique
|
| 91 |
-
# closest target.
|
| 92 |
#
|
| 93 |
# Hard-tier spawn-variation (≥2 spawn_point groups, registered in
|
| 94 |
# tests/test_hard_tier.py::UPGRADED):
|
| 95 |
# - NORTH staging y=11..13 (agent at (30,11..13)).
|
| 96 |
# - SOUTH staging y=27..29 (agent at (30,27..29)).
|
| 97 |
-
# The
|
| 98 |
-
#
|
| 99 |
-
#
|
| 100 |
-
# closest enemy is (50,15) and the farthest is (50,25); from SOUTH
|
| 101 |
-
# the order inverts. A memorised single-target sequence cannot
|
| 102 |
-
# generalise across the spawn rotation.
|
| 103 |
#
|
| 104 |
# Engine guardrails (per CLAUDE.md):
|
| 105 |
# - Map: rush-hour-arena (128 × 40, playable x ∈ [2..126],
|
|
@@ -116,20 +76,18 @@
|
|
| 116 |
# "Certain mid-map cells silently fail to place enemy clusters
|
| 117 |
# (e.g. (50,20))"; (51,20) is a documented working cell.
|
| 118 |
# - `within_ticks: 2400` / `after_ticks: 2401` on easy+medium;
|
| 119 |
-
# max_turns=30 produces tick ≤ 93 + 90·29 = 2703 ⇒
|
| 120 |
-
# brute
|
| 121 |
# `within_ticks: 1200` / `after_ticks: 1201` and max_turns=15
|
| 122 |
# (tick ≤ 93 + 90·14 = 1353 ≥ 1201) — kill-speed pressure for
|
| 123 |
# the focus-fire policy.
|
| 124 |
# - Enemy `bot_type: ''` (no scripted bot pursuit) — enemy tanks
|
| 125 |
# sit on stance:2 Defend so they auto-fire the second a tank
|
| 126 |
# enters cannon range but NEVER advance; the enemy line stays
|
| 127 |
-
# STATIONARY on its
|
| 128 |
-
#
|
| 129 |
-
#
|
| 130 |
-
#
|
| 131 |
-
# spread-fire wrong-play into focus-fire and collapses the
|
| 132 |
-
# discrimination — stance:2 keeps the spread geometry intact.
|
| 133 |
# - Agent tanks stance:1 ReturnFire so a stall policy (pure observe,
|
| 134 |
# no movement) doesn't accidentally pull fire from any agent tank
|
| 135 |
# before the enemy is in range — the stall remains a clean
|
|
@@ -140,22 +98,23 @@ meta:
|
|
| 140 |
title: 'Tank-vs-Tank Mirror — Focus-Fire, Lanchester Square Law'
|
| 141 |
capability: action
|
| 142 |
real_world_meaning: >
|
| 143 |
-
|
| 144 |
-
|
| 145 |
-
|
| 146 |
-
|
| 147 |
-
|
| 148 |
-
|
| 149 |
-
|
| 150 |
-
|
| 151 |
-
|
| 152 |
-
|
| 153 |
-
|
| 154 |
-
|
| 155 |
-
|
| 156 |
-
|
| 157 |
-
|
| 158 |
-
the survival cap
|
|
|
|
| 159 |
robotics_analogue: >
|
| 160 |
Military "concentration of force" doctrine (one of the Principles
|
| 161 |
of War): a smaller or equal force concentrated at the decisive
|
|
@@ -194,23 +153,21 @@ levels:
|
|
| 194 |
# Bare focus-fire skill: 3-vs-3 asymmetric mirror, survival bar ≥1
|
| 195 |
# (forgiving — even if focus-fire loses 2 tanks in the trade, ≥1
|
| 196 |
# alive suffices). Stall LOSES (kill bar unmet → after_ticks LOSS).
|
| 197 |
-
# Brute attack-move LOSES (drives into
|
| 198 |
-
# wipes).
|
| 199 |
-
#
|
| 200 |
-
# discrimination is at medium.
|
| 201 |
easy:
|
| 202 |
description: >
|
| 203 |
Three medium tanks (2tnk, allies) at (30,19..21) face THREE
|
| 204 |
-
enemy medium tanks (2tnk, soviet)
|
| 205 |
-
|
| 206 |
-
range
|
| 207 |
-
|
| 208 |
-
|
| 209 |
-
|
| 210 |
-
the
|
| 211 |
-
|
| 212 |
-
|
| 213 |
-
is intact, before tick 2400.
|
| 214 |
overrides:
|
| 215 |
actors:
|
| 216 |
# Agent base anchor (paranoia gate against the turn-1
|
|
@@ -223,16 +180,11 @@ levels:
|
|
| 223 |
- {type: 2tnk, owner: agent, position: [30, 19], stance: 1}
|
| 224 |
- {type: 2tnk, owner: agent, position: [30, 20], stance: 1}
|
| 225 |
- {type: 2tnk, owner: agent, position: [30, 21], stance: 1}
|
| 226 |
-
# Enemy
|
| 227 |
-
#
|
| 228 |
-
#
|
| 229 |
-
#
|
| 230 |
-
#
|
| 231 |
-
# tanks HUNT and BUNCH onto the agent column, so the
|
| 232 |
-
# spread-fire wrong-play degenerates into focus-fire and the
|
| 233 |
-
# discrimination collapses; stance:2 keeps the enemy line
|
| 234 |
-
# STATIONARY on its three latitudes so spread-fire genuinely
|
| 235 |
-
# fans the agent tanks into 1-vs-1 flank duels).
|
| 236 |
- {type: 2tnk, owner: enemy, position: [50, 15], stance: 2}
|
| 237 |
- {type: 2tnk, owner: enemy, position: [51, 20], stance: 2}
|
| 238 |
- {type: 2tnk, owner: enemy, position: [50, 25], stance: 2}
|
|
@@ -253,32 +205,45 @@ levels:
|
|
| 253 |
max_turns: 30
|
| 254 |
|
| 255 |
# ── MEDIUM ──────────────────────────────────────────────────────────
|
| 256 |
-
# +1 controlled variable vs easy:
|
| 257 |
-
#
|
| 258 |
-
#
|
| 259 |
-
#
|
| 260 |
-
#
|
| 261 |
-
#
|
| 262 |
medium:
|
| 263 |
description: >
|
| 264 |
-
Three medium tanks (2tnk, allies) at (30,19..21) face
|
| 265 |
-
enemy medium tanks (2tnk, soviet)
|
| 266 |
-
|
| 267 |
-
|
| 268 |
-
|
| 269 |
-
|
| 270 |
-
|
| 271 |
-
|
| 272 |
-
|
|
|
|
|
|
|
| 273 |
overrides:
|
| 274 |
actors:
|
| 275 |
- {type: fact, owner: agent, position: [4, 20]}
|
| 276 |
- {type: 2tnk, owner: agent, position: [30, 19], stance: 1}
|
| 277 |
- {type: 2tnk, owner: agent, position: [30, 20], stance: 1}
|
| 278 |
- {type: 2tnk, owner: agent, position: [30, 21], stance: 1}
|
| 279 |
-
|
| 280 |
-
|
| 281 |
-
- {type: 2tnk, owner: enemy, position: [50,
|
|
|
|
|
|
|
|
|
|
| 282 |
- {type: fact, owner: enemy, position: [124, 20]}
|
| 283 |
win_condition:
|
| 284 |
all_of:
|
|
@@ -295,39 +260,28 @@ levels:
|
|
| 295 |
# ── HARD ────────────────────────────────────────────────────────────
|
| 296 |
# +2 controlled variables vs medium:
|
| 297 |
# 1. KILL-SPEED PRESSURE — within_ticks tightens from 2400 to
|
| 298 |
-
# 1200
|
| 299 |
-
#
|
| 300 |
-
#
|
| 301 |
-
# the kill-speed timer becomes the load-bearing
|
| 302 |
-
# discriminator). Focus-fire ends the engagement in
|
| 303 |
-
# ~700-1000 ticks (3 cannons on 1 target each turn); brute
|
| 304 |
-
# drive-into-crossfire and stall both fail the clock.
|
| 305 |
# 2. TWO seed-driven spawn_point groups (NORTH staging y=11..13
|
| 306 |
-
# vs SOUTH staging y=27..29) round-robined by seed so
|
| 307 |
-
#
|
| 308 |
-
#
|
| 309 |
-
#
|
| 310 |
-
# The survival cap RELAXES to own_units_gte:1 on hard (per task
|
| 311 |
-
# brief): on a flank spawn the spread-fire policy naturally
|
| 312 |
-
# focus-fires the unique closest enemy, so the spread-vs-focus
|
| 313 |
-
# delta on hard is primarily kill-speed (within_ticks) rather
|
| 314 |
-
# than survivor count.
|
| 315 |
hard:
|
| 316 |
description: >
|
| 317 |
Three medium tanks (2tnk, allies) stage at ONE of two
|
| 318 |
staging corridors (NORTH y=11..13 OR SOUTH y=27..29, chosen
|
| 319 |
by seed, anti-memorisation), all bunched at x=30 on adjacent
|
| 320 |
rows. They face THREE enemy medium tanks (2tnk, soviet)
|
| 321 |
-
|
| 322 |
-
|
| 323 |
-
|
| 324 |
-
|
| 325 |
-
|
| 326 |
-
enemy tanks are killed AND at
|
| 327 |
-
survives AND your base is intact,
|
| 328 |
-
|
| 329 |
-
anything slower than concentrated focus-fire busts the
|
| 330 |
-
clock).
|
| 331 |
overrides:
|
| 332 |
actors:
|
| 333 |
# Agent base anchor — duplicated under BOTH spawn_point
|
|
|
|
| 1 |
+
# combat-tank-vs-tank-engagement — tank trade: WIN by a controlled
|
| 2 |
+
# focus-fire `attack_unit` engagement (close to cannon range, HOLD,
|
| 3 |
+
# concentrate fire one target at a time), LOSE by a brute
|
| 4 |
+
# `attack_move` drive straight into the enemy position.
|
| 5 |
#
|
| 6 |
# Wave-7 ACTION pack (capability: action — combat micro: target
|
| 7 |
# prioritization / focus-fire discipline).
|
| 8 |
#
|
| 9 |
# Real-world / benchmark anchors:
|
| 10 |
+
# - SC2 mirror micro: the side that holds and concentrates fire one
|
| 11 |
+
# target at a time clears the line keeping its strength; the side
|
| 12 |
+
# that charges in eats the whole line's crossfire and is wiped.
|
| 13 |
+
# - Lanchester's SQUARE LAW: per-kill removal of one enemy's OUTPUT
|
| 14 |
+
# DPS — a held, concentrated engagement removes enemy firepower
|
| 15 |
+
# a whole tank at a time.
|
| 16 |
+
# - Military "CONCENTRATION OF FORCE" doctrine (one of the
|
| 17 |
+
# Principles of War): a force fighting at a controlled engagement
|
| 18 |
+
# range defeats one that throws itself into the enemy's midst.
|
|
|
|
|
|
|
| 19 |
#
|
| 20 |
+
# RECALIBRATION FINDING (engine movement fixes — moving units take
|
| 21 |
+
# fire en route, attack_unit on out-of-sight targets paths normally
|
| 22 |
+
# at real Mobile speed, no sprint-invincibility):
|
| 23 |
+
# With the post-fix combat model a SYMMETRIC tank mirror is a flat
|
| 24 |
+
# meat-grinder — whatever the target assignment (concentrate on one
|
| 25 |
+
# target, or each tank its own nearest), the closing force loses
|
| 26 |
+
# exactly the same number of tanks. The symmetric-mirror
|
| 27 |
+
# focus-vs-spread SURVIVOR delta the pack originally relied on no
|
| 28 |
+
# longer exists in the engine (a per-tank-own-nearest policy ends
|
| 29 |
+
# identically to a single-target focus policy). Concentrating fire
|
| 30 |
+
# on a bunched stack ALSO bunches the stack's exposure — there is
|
| 31 |
+
# no free square-law surplus.
|
| 32 |
+
# The load-bearing discrimination is therefore CONTROLLED
|
| 33 |
+
# ENGAGEMENT vs BRUTE drive-in:
|
| 34 |
+
# * Intended (focus-fire `attack_unit`): the order closes the
|
| 35 |
+
# force to cannon range and HOLDS there — the agent fires from
|
| 36 |
+
# range and works down the enemy line. Clears the line keeping
|
| 37 |
+
# its strength ⇒ WIN.
|
| 38 |
+
# * Brute (`attack_move` onto the enemy cell): drives the column
|
| 39 |
+
# INTO the enemy position; the stack is enveloped, absorbs the
|
| 40 |
+
# whole line's crossfire at once, and force-wipes before
|
| 41 |
+
# clearing 3 kills ⇒ LOSS.
|
| 42 |
+
# * Stall (only observe): never closes; nothing dies; kill bar
|
| 43 |
+
# unmet ⇒ after_ticks LOSS.
|
| 44 |
#
|
| 45 |
+
# Difficulty axis (controlled variables per tier):
|
| 46 |
+
# - EASY — 3-vs-3. Bare engagement skill; survival bar ≥1.
|
| 47 |
+
# - MEDIUM — 4-vs-3 (a FOURTH enemy tank; the agent is numerically
|
| 48 |
+
# out-gunned). A held focus engagement clears ≥3 of the 4 enemy
|
| 49 |
+
# tanks while keeping ≥2 of its own; the brute drive-in is wiped
|
| 50 |
+
# by the 4-tank crossfire before killing 3. This over-match is
|
| 51 |
+
# the load-bearing discrimination.
|
| 52 |
+
# - HARD — 3-vs-3 with a tight kill-speed deadline (within_ticks
|
| 53 |
+
# 1200) and two seed-driven spawn corridors (NORTH y=11..13 /
|
| 54 |
+
# SOUTH y=27..29) so the approach axis can't be memorised.
|
| 55 |
#
|
| 56 |
# Hard-tier spawn-variation (≥2 spawn_point groups, registered in
|
| 57 |
# tests/test_hard_tier.py::UPGRADED):
|
| 58 |
# - NORTH staging y=11..13 (agent at (30,11..13)).
|
| 59 |
# - SOUTH staging y=27..29 (agent at (30,27..29)).
|
| 60 |
+
# The enemy line (3 enemies at y=15/y=20/y=25) is the SAME for both
|
| 61 |
+
# spawns (enemy actors don't honour spawn_point per CLAUDE.md /
|
| 62 |
+
# oramap.rs::expand_scenario_actors).
|
|
|
|
|
|
|
|
|
|
| 63 |
#
|
| 64 |
# Engine guardrails (per CLAUDE.md):
|
| 65 |
# - Map: rush-hour-arena (128 × 40, playable x ∈ [2..126],
|
|
|
|
| 76 |
# "Certain mid-map cells silently fail to place enemy clusters
|
| 77 |
# (e.g. (50,20))"; (51,20) is a documented working cell.
|
| 78 |
# - `within_ticks: 2400` / `after_ticks: 2401` on easy+medium;
|
| 79 |
+
# max_turns=30 produces tick ≤ 93 + 90·29 = 2703 ⇒ stall /
|
| 80 |
+
# brute hit the real LOSS, not a DRAW. Hard uses
|
| 81 |
# `within_ticks: 1200` / `after_ticks: 1201` and max_turns=15
|
| 82 |
# (tick ≤ 93 + 90·14 = 1353 ≥ 1201) — kill-speed pressure for
|
| 83 |
# the focus-fire policy.
|
| 84 |
# - Enemy `bot_type: ''` (no scripted bot pursuit) — enemy tanks
|
| 85 |
# sit on stance:2 Defend so they auto-fire the second a tank
|
| 86 |
# enters cannon range but NEVER advance; the enemy line stays
|
| 87 |
+
# STATIONARY on its latitudes. stance:3 AttackAnything would
|
| 88 |
+
# make the enemy tanks hunt and chase the agent — stance:2
|
| 89 |
+
# keeps the line in place so the engagement is a clean
|
| 90 |
+
# close-and-trade against a fixed objective.
|
|
|
|
|
|
|
| 91 |
# - Agent tanks stance:1 ReturnFire so a stall policy (pure observe,
|
| 92 |
# no movement) doesn't accidentally pull fire from any agent tank
|
| 93 |
# before the enemy is in range — the stall remains a clean
|
|
|
|
| 98 |
title: 'Tank-vs-Tank Mirror — Focus-Fire, Lanchester Square Law'
|
| 99 |
capability: action
|
| 100 |
real_world_meaning: >
|
| 101 |
+
A three-tank strike force engages a stationary enemy tank line.
|
| 102 |
+
The decision under test is combat micro: close to cannon range,
|
| 103 |
+
HOLD the engagement at range, and concentrate `attack_unit` fire
|
| 104 |
+
on one target at a time — eliminate the nearest enemy, then the
|
| 105 |
+
next, working down the line. Per the "concentration of force"
|
| 106 |
+
doctrine and the Lanchester square law, a force that holds and
|
| 107 |
+
focus-fires removes enemy OUTPUT DPS one whole tank per kill and
|
| 108 |
+
clears the line keeping its strength; a force that brute
|
| 109 |
+
`attack_move`s straight INTO the enemy position bunches itself in
|
| 110 |
+
the enemy's midst, absorbs the whole line's crossfire at once,
|
| 111 |
+
and is wiped before it can clear the engagement. On medium the
|
| 112 |
+
agent is numerically out-gunned 4-vs-3, so the controlled
|
| 113 |
+
engagement is load-bearing: only a held, concentrated focus-fire
|
| 114 |
+
push clears ≥3 of the 4 enemy tanks while keeping ≥2 of its own.
|
| 115 |
+
Stalling never engages and loses on the kill bar; the brute
|
| 116 |
+
drive-in loses on the survival cap / kill bar; only the
|
| 117 |
+
controlled focus-fire engagement wins.
|
| 118 |
robotics_analogue: >
|
| 119 |
Military "concentration of force" doctrine (one of the Principles
|
| 120 |
of War): a smaller or equal force concentrated at the decisive
|
|
|
|
| 153 |
# Bare focus-fire skill: 3-vs-3 asymmetric mirror, survival bar ≥1
|
| 154 |
# (forgiving — even if focus-fire loses 2 tanks in the trade, ≥1
|
| 155 |
# alive suffices). Stall LOSES (kill bar unmet → after_ticks LOSS).
|
| 156 |
+
# Brute attack-move LOSES (drives into the 3-tank crossfire and
|
| 157 |
+
# force-wipes). The bare engagement skill: close to cannon range
|
| 158 |
+
# and clear the line with a controlled focus-fire engagement.
|
|
|
|
| 159 |
easy:
|
| 160 |
description: >
|
| 161 |
Three medium tanks (2tnk, allies) at (30,19..21) face THREE
|
| 162 |
+
enemy medium tanks (2tnk, soviet) along the eastern line at
|
| 163 |
+
(50,15), (51,20), and (50,25). Close to firing range (cannon
|
| 164 |
+
range ~5), HOLD the engagement at range, and `attack_unit` the
|
| 165 |
+
enemy tanks down one at a time — start with the nearest. Do
|
| 166 |
+
NOT drive the column straight onto the enemy position: an
|
| 167 |
+
attack-move into their midst bunches you in the crossfire and
|
| 168 |
+
wipes the force. Win when all 3 enemy tanks are killed AND at
|
| 169 |
+
least ONE of your tanks survives AND your base is intact,
|
| 170 |
+
before tick 2400.
|
|
|
|
| 171 |
overrides:
|
| 172 |
actors:
|
| 173 |
# Agent base anchor (paranoia gate against the turn-1
|
|
|
|
| 180 |
- {type: 2tnk, owner: agent, position: [30, 19], stance: 1}
|
| 181 |
- {type: 2tnk, owner: agent, position: [30, 20], stance: 1}
|
| 182 |
- {type: 2tnk, owner: agent, position: [30, 21], stance: 1}
|
| 183 |
+
# Enemy line — 3 medium tanks across y=15/y=20/y=25. Centre
|
| 184 |
+
# at (51,20) NOT (50,20) per CLAUDE.md silent-fail cell note.
|
| 185 |
+
# stance:2 Defend — auto-fire on the closest in-range enemy
|
| 186 |
+
# but NEVER advance, so the line stays a fixed engagement
|
| 187 |
+
# objective (a clean close-and-trade, not a chase).
|
| 188 |
- {type: 2tnk, owner: enemy, position: [50, 15], stance: 2}
|
| 189 |
- {type: 2tnk, owner: enemy, position: [51, 20], stance: 2}
|
| 190 |
- {type: 2tnk, owner: enemy, position: [50, 25], stance: 2}
|
|
|
|
| 205 |
max_turns: 30
|
| 206 |
|
| 207 |
# ── MEDIUM ──────────────────────────────────────────────────────────
|
| 208 |
+
# +1 controlled variable vs easy: a FOURTH enemy tank (4-vs-3,
|
| 209 |
+
# numerically OUT-gunned) plus a survival bar of ≥2. With the
|
| 210 |
+
# post-movement-fix engine a 3-vs-3 mirror is a flat meat-grinder
|
| 211 |
+
# (whatever the targeting, the agent loses exactly 2 tanks — the
|
| 212 |
+
# symmetric-mirror focus-vs-spread survivor delta the pack
|
| 213 |
+
# originally relied on no longer exists). The load-bearing
|
| 214 |
+
# discrimination is therefore CONTROLLED ENGAGEMENT vs BRUTE
|
| 215 |
+
# drive-in: a focus-fire `attack_unit` engagement closes to cannon
|
| 216 |
+
# range, holds, and concentrates fire — clears ≥3 of the 4 enemy
|
| 217 |
+
# tanks while keeping ≥2 of its own; a brute
|
| 218 |
+
# `attack_move` drive INTO the 4-tank position bunches the column
|
| 219 |
+
# in the enemy's midst, eats 4-tank crossfire, and force-wipes
|
| 220 |
+
# before killing 3. Win = kill ≥3 enemy tanks AND keep ≥2 of your
|
| 221 |
+
# own, before tick 2400.
|
| 222 |
medium:
|
| 223 |
description: >
|
| 224 |
+
Three medium tanks (2tnk, allies) at (30,19..21) face FOUR
|
| 225 |
+
enemy medium tanks (2tnk, soviet) along the eastern line at
|
| 226 |
+
(50,14), (51,18), (50,22), and (51,26) — you are outnumbered
|
| 227 |
+
4-vs-3. Close to cannon range (~5) and concentrate fire:
|
| 228 |
+
`attack_unit` the nearest enemy, hold the engagement at range,
|
| 229 |
+
and eliminate the enemy line one tank at a time. Driving the
|
| 230 |
+
column straight INTO the enemy position (a brute attack-move)
|
| 231 |
+
bunches you in their crossfire and wipes the force before it
|
| 232 |
+
clears the line. Win when at least 3 enemy tanks are killed
|
| 233 |
+
AND at least TWO of your tanks survive AND your base is
|
| 234 |
+
intact, before tick 2400.
|
| 235 |
overrides:
|
| 236 |
actors:
|
| 237 |
- {type: fact, owner: agent, position: [4, 20]}
|
| 238 |
- {type: 2tnk, owner: agent, position: [30, 19], stance: 1}
|
| 239 |
- {type: 2tnk, owner: agent, position: [30, 20], stance: 1}
|
| 240 |
- {type: 2tnk, owner: agent, position: [30, 21], stance: 1}
|
| 241 |
+
# Enemy line — FOUR tanks (4-vs-3 over-match). stance:2 Defend
|
| 242 |
+
# (stationary line; see the easy/hard comment).
|
| 243 |
+
- {type: 2tnk, owner: enemy, position: [50, 14], stance: 2}
|
| 244 |
+
- {type: 2tnk, owner: enemy, position: [51, 18], stance: 2}
|
| 245 |
+
- {type: 2tnk, owner: enemy, position: [50, 22], stance: 2}
|
| 246 |
+
- {type: 2tnk, owner: enemy, position: [51, 26], stance: 2}
|
| 247 |
- {type: fact, owner: enemy, position: [124, 20]}
|
| 248 |
win_condition:
|
| 249 |
all_of:
|
|
|
|
| 260 |
# ── HARD ────────────────────────────────────────────────────────────
|
| 261 |
# +2 controlled variables vs medium:
|
| 262 |
# 1. KILL-SPEED PRESSURE — within_ticks tightens from 2400 to
|
| 263 |
+
# 1200. A controlled focus-fire engagement ends the
|
| 264 |
+
# 3-vs-3 trade in ~800-1000 ticks; stall and the brute
|
| 265 |
+
# drive-into-crossfire both fail the clock.
|
|
|
|
|
|
|
|
|
|
|
|
|
| 266 |
# 2. TWO seed-driven spawn_point groups (NORTH staging y=11..13
|
| 267 |
+
# vs SOUTH staging y=27..29) round-robined by seed so the
|
| 268 |
+
# approach axis cannot be memorised.
|
| 269 |
+
# The survival cap is own_units_gte:1 on hard (the kill-speed
|
| 270 |
+
# deadline is the binding discriminator at this tier).
|
| 271 |
hard:
|
| 272 |
description: >
|
| 273 |
Three medium tanks (2tnk, allies) stage at ONE of two
|
| 274 |
staging corridors (NORTH y=11..13 OR SOUTH y=27..29, chosen
|
| 275 |
by seed, anti-memorisation), all bunched at x=30 on adjacent
|
| 276 |
rows. They face THREE enemy medium tanks (2tnk, soviet)
|
| 277 |
+
along the eastern line at (50,15), (51,20), and (50,25).
|
| 278 |
+
Close to cannon range, HOLD the engagement, and `attack_unit`
|
| 279 |
+
the enemy tanks down one at a time — fast. A brute attack-move
|
| 280 |
+
into the enemy position is wiped in the crossfire; stalling or
|
| 281 |
+
anything slower than a controlled focus-fire push busts the
|
| 282 |
+
tight clock. Win when all 3 enemy tanks are killed AND at
|
| 283 |
+
least ONE of your tanks survives AND your base is intact,
|
| 284 |
+
before tick 1200.
|
|
|
|
|
|
|
| 285 |
overrides:
|
| 286 |
actors:
|
| 287 |
# Agent base anchor — duplicated under BOTH spawn_point
|
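The tick-budget arithmetic quoted in the pack's engine-guardrails comment (`93 + 90·29 = 2703` for `max_turns=30`, `93 + 90·14 = 1353` for `max_turns=15`) can be checked with a short sketch. The constants 93 and 90 are read off that comment and are presumably the first-turn tick offset and the per-turn tick step; the function names are hypothetical and are not part of `openra_bench`.

```python
# Hypothetical helpers: they only restate the arithmetic from the pack comment.
FIRST_TURN_TICK = 93   # assumed offset of the first decision turn (from the comment)
TICKS_PER_TURN = 90    # assumed tick step between decision turns (from the comment)


def max_reachable_tick(max_turns: int) -> int:
    """Upper bound on the game tick reached after `max_turns` decision turns."""
    return FIRST_TURN_TICK + TICKS_PER_TURN * (max_turns - 1)


def timeout_loss_is_reachable(max_turns: int, after_ticks: int) -> bool:
    """True if the episode can run long enough to trip the `after_ticks`
    fail clause, so a stalled run ends as a LOSS rather than a DRAW."""
    return max_reachable_tick(max_turns) >= after_ticks


if __name__ == "__main__":
    for level, turns, after in (("easy/medium", 30, 2401), ("hard", 15, 1201)):
        print(level, max_reachable_tick(turns), timeout_loss_is_reachable(turns, after))
```

Under these assumptions a `max_turns` that is too small for its deadline (for example 13 turns against `after_ticks: 1201`) would report `False`, which is the DRAW-instead-of-LOSS case the comment guards against.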
tests/test_combat_tank_vs_tank_engagement.py
CHANGED
|
@@ -1,33 +1,37 @@
|
|
| 1 |
-
"""combat-tank-vs-tank-engagement —
|
| 2 |
-
|
| 3 |
-
|
| 4 |
-
|
| 5 |
-
|
| 6 |
-
|
| 7 |
-
|
| 8 |
-
|
| 9 |
-
|
| 10 |
-
|
| 11 |
-
|
| 12 |
-
|
| 13 |
-
|
| 14 |
-
the
|
| 15 |
-
|
| 16 |
-
|
| 17 |
-
|
| 18 |
-
|
| 19 |
-
|
| 20 |
-
|
| 21 |
-
|
| 22 |
-
|
| 23 |
-
|
| 24 |
-
|
| 25 |
-
|
| 26 |
-
|
| 27 |
-
|
| 28 |
-
|
| 29 |
-
|
| 30 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
| 31 |
|
| 32 |
Validation is scripted (no model / network).
|
| 33 |
"""
|
|
@@ -171,32 +175,33 @@ def test_hard_has_two_spawn_point_groups():
|
|
| 171 |
assert len(groups) >= 2, f"hard needs ≥2 spawn_point groups, got {groups}"
|
| 172 |
|
| 173 |
|
| 174 |
-
def
|
| 175 |
-
"""The
|
| 176 |
-
|
| 177 |
-
|
| 178 |
-
|
| 179 |
-
x=50) per the CLAUDE.md silent-fail-cell note for (50,20)."""
|
| 180 |
pack = load_pack(PACK_PATH)
|
|
|
|
| 181 |
for lvl in ("easy", "medium", "hard"):
|
| 182 |
c = compile_level(pack, lvl)
|
| 183 |
enemy_tanks = [
|
| 184 |
a for a in c.scenario.actors
|
| 185 |
if a.owner == "enemy" and a.type == "2tnk"
|
| 186 |
]
|
| 187 |
-
assert len(enemy_tanks) ==
|
| 188 |
-
f"{lvl}: must have exactly
|
|
|
|
| 189 |
)
|
| 190 |
ys = sorted(a.position[1] for a in enemy_tanks)
|
| 191 |
-
assert len(set(ys)) ==
|
| 192 |
-
f"{lvl}: enemy tanks must be on
|
| 193 |
-
f"
|
| 194 |
)
|
| 195 |
# Verify the (50,20) silent-fail cell is NOT used.
|
| 196 |
positions = [tuple(a.position) for a in enemy_tanks]
|
| 197 |
assert (50, 20) not in positions, (
|
| 198 |
f"{lvl}: (50,20) is a CLAUDE.md-documented silent-fail "
|
| 199 |
-
f"cell
|
| 200 |
)
|
| 201 |
types = [a.type for a in c.scenario.actors if a.owner == "enemy"]
|
| 202 |
assert "fact" in types, f"{lvl}: needs a persistent enemy fact"
|
|
@@ -261,42 +266,17 @@ def _stall(rs, Command):
|
|
| 261 |
|
| 262 |
|
| 263 |
def _brute_attack_move(rs, Command):
|
| 264 |
-
"""Brute: every tank attack_moves
|
| 265 |
-
|
| 266 |
-
|
|
|
|
|
|
|
| 267 |
own = _own_ids(rs)
|
| 268 |
if not own:
|
| 269 |
return [Command.observe()]
|
| 270 |
return [Command.attack_move(own, 51, 20)]
|
| 271 |
|
| 272 |
|
| 273 |
-
def _spread_attack_closest(rs, Command):
|
| 274 |
-
"""Spread: each agent tank attack_units ITS OWN nearest visible
|
| 275 |
-
enemy tank. With the asymmetric spread (3 enemies on three rows),
|
| 276 |
-
once the centre dies the surviving agent tanks chase different
|
| 277 |
-
flank enemies in 1-vs-1 duels — Lanchester linear law collapses
|
| 278 |
-
the trade to mutual annihilation, ending with 1-of-3 alive. On
|
| 279 |
-
MEDIUM (own_units_gte:2) this busts the survival cap ⇒ LOSS."""
|
| 280 |
-
own = _own_ids(rs)
|
| 281 |
-
if not own:
|
| 282 |
-
return [Command.observe()]
|
| 283 |
-
es = _enemy_tanks(rs)
|
| 284 |
-
if not es:
|
| 285 |
-
# No targets in sight — advance to contact.
|
| 286 |
-
return [Command.attack_move(own, 51, 20)]
|
| 287 |
-
cmds = []
|
| 288 |
-
for u in (rs.get("units_summary") or []):
|
| 289 |
-
uid = str(u["id"])
|
| 290 |
-
ux, uy = u["cell_x"], u["cell_y"]
|
| 291 |
-
es_sorted = sorted(
|
| 292 |
-
es, key=lambda e: (e["cell_x"] - ux) ** 2 + (e["cell_y"] - uy) ** 2
|
| 293 |
-
)
|
| 294 |
-
tid = es_sorted[0].get("id")
|
| 295 |
-
if tid is not None:
|
| 296 |
-
cmds.append(Command.attack_unit([uid], str(tid)))
|
| 297 |
-
return cmds or [Command.observe()]
|
| 298 |
-
|
| 299 |
-
|
| 300 |
def _focus_fire(rs, Command):
|
| 301 |
"""Focus-fire: ALL agent tanks attack_unit the SAME target each
|
| 302 |
turn — the closest enemy to the agent centroid. Once that enemy
|
|
@@ -370,28 +350,30 @@ def test_brute_attack_move_loses(level, seed):
|
|
| 370 |
)
|
| 371 |
|
| 372 |
|
| 373 |
-
@pytest.mark.parametrize("level", ["medium"])
|
| 374 |
@pytest.mark.parametrize("seed", [1, 2, 3, 4])
|
| 375 |
-
def
|
| 376 |
-
"""
|
| 377 |
-
|
| 378 |
-
the
|
| 379 |
-
|
| 380 |
-
|
| 381 |
-
|
| 382 |
-
|
| 383 |
-
|
| 384 |
-
that all 3 agent tanks naturally target (spread ≡ focus); the
|
| 385 |
-
hard discrimination is kill-speed + spawn-variation, not the
|
| 386 |
-
survivor-count delta."""
|
| 387 |
pytest.importorskip("openra_train")
|
| 388 |
from openra_bench.eval_core import run_level
|
| 389 |
|
| 390 |
c = compile_level(load_pack(PACK_PATH), level)
|
| 391 |
-
|
| 392 |
-
|
| 393 |
-
|
| 394 |
-
f"
|
| 395 |
-
f"got {
|
| 396 |
-
f"losses={
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 397 |
)
|
|
|
|
| 1 |
+
"""combat-tank-vs-tank-engagement — tank trade: a controlled
|
| 2 |
+
focus-fire `attack_unit` engagement WINS; STALL and a BRUTE
|
| 3 |
+
`attack_move` drive-in LOSE.
|
| 4 |
+
|
| 5 |
+
The bar: the intended FOCUS-fire engagement (close to cannon range,
|
| 6 |
+
hold, concentrate `attack_unit` fire on one target at a time) WINS on
|
| 7 |
+
every level and every hard seed (1-4); STALL (pure observe) and a
|
| 8 |
+
BRUTE `attack_move` drive straight INTO the enemy position LOSE on
|
| 9 |
+
every level and every hard seed. Non-win is a real reachable timeout
|
| 10 |
+
LOSS via the `after_ticks` fail clause (within_ticks 2400 +
|
| 11 |
+
after_ticks 2401 on easy/medium with max_turns 30; within_ticks 1200
|
| 12 |
+
+ after_ticks 1201 on hard with max_turns 15).
|
| 13 |
+
|
| 14 |
+
Recalibrated after the engine movement fixes (moving units take fire
|
| 15 |
+
en route; `attack_unit` on out-of-sight targets paths normally at
|
| 16 |
+
real Mobile speed; no sprint-invincibility). Finding from this
|
| 17 |
+
recalibration: with the post-fix combat model a SYMMETRIC 3-vs-3
|
| 18 |
+
tank mirror is a flat meat-grinder — whatever the target assignment
|
| 19 |
+
(focus one target, or each tank its own nearest), the agent loses
|
| 20 |
+
exactly two tanks closing the distance. The symmetric-mirror
|
| 21 |
+
focus-vs-spread SURVIVOR delta the pack originally relied on no
|
| 22 |
+
longer exists in the engine (a `spread_closest` policy ends
|
| 23 |
+
identically to focus). The load-bearing discrimination is therefore
|
| 24 |
+
CONTROLLED ENGAGEMENT vs BRUTE drive-in, and the difficulty axis is
|
| 25 |
+
re-tuned:
|
| 26 |
+
* EASY — 3-vs-3. Focus `attack_unit` closes to cannon range and
|
| 27 |
+
clears the line (≥1 survivor); a brute `attack_move` onto the
|
| 28 |
+
enemy cell bunches the column in the crossfire and force-wipes.
|
| 29 |
+
* MEDIUM — 4-vs-3 (a fourth enemy tank, the agent is
|
| 30 |
+
numerically out-gunned). A controlled focus engagement clears
|
| 31 |
+
≥3 of the 4 enemy tanks while keeping ≥2 of its own; a brute
|
| 32 |
+
drive-in eats 4-tank crossfire and wipes before killing 3.
|
| 33 |
+
* HARD — 3-vs-3 with a tight kill-speed deadline (within_ticks
|
| 34 |
+
1200) and two seed-driven spawn corridors (NORTH / SOUTH).
|
| 35 |
|
| 36 |
Validation is scripted (no model / network).
|
| 37 |
"""
|
|
|
|
| 175 |
assert len(groups) >= 2, f"hard needs ≥2 spawn_point groups, got {groups}"
|
| 176 |
|
| 177 |
|
| 178 |
+
def test_enemy_line_is_a_spread_tank_line():
|
| 179 |
+
"""The enemy line MUST be a spread tank line on distinct
|
| 180 |
+
latitudes (each enemy independently targetable): 3 tanks on
|
| 181 |
+
easy/hard, 4 on medium (the 4-vs-3 over-match). The (50,20)
|
| 182 |
+
silent-fail cell must not be used."""
|
|
|
|
| 183 |
pack = load_pack(PACK_PATH)
|
| 184 |
+
expected = {"easy": 3, "medium": 4, "hard": 3}
|
| 185 |
for lvl in ("easy", "medium", "hard"):
|
| 186 |
c = compile_level(pack, lvl)
|
| 187 |
enemy_tanks = [
|
| 188 |
a for a in c.scenario.actors
|
| 189 |
if a.owner == "enemy" and a.type == "2tnk"
|
| 190 |
]
|
| 191 |
+
assert len(enemy_tanks) == expected[lvl], (
|
| 192 |
+
f"{lvl}: must have exactly {expected[lvl]} enemy tanks, "
|
| 193 |
+
f"got {len(enemy_tanks)}"
|
| 194 |
)
|
| 195 |
ys = sorted(a.position[1] for a in enemy_tanks)
|
| 196 |
+
assert len(set(ys)) == expected[lvl], (
|
| 197 |
+
f"{lvl}: enemy tanks must be on {expected[lvl]} distinct "
|
| 198 |
+
f"latitudes (spread line), got ys={ys}"
|
| 199 |
)
|
| 200 |
# Verify the (50,20) silent-fail cell is NOT used.
|
| 201 |
positions = [tuple(a.position) for a in enemy_tanks]
|
| 202 |
assert (50, 20) not in positions, (
|
| 203 |
f"{lvl}: (50,20) is a CLAUDE.md-documented silent-fail "
|
| 204 |
+
f"cell. Got {positions}"
|
| 205 |
)
|
| 206 |
types = [a.type for a in c.scenario.actors if a.owner == "enemy"]
|
| 207 |
assert "fact" in types, f"{lvl}: needs a persistent enemy fact"
|
|
|
|
| 266 |
|
| 267 |
|
| 268 |
def _brute_attack_move(rs, Command):
|
| 269 |
+
"""Brute: every tank attack_moves straight onto the enemy line.
|
| 270 |
+
The `attack_move` drives the bunched column INTO the enemy
|
| 271 |
+
position (rather than holding at cannon range) — the stack is
|
| 272 |
+
enveloped in the enemy crossfire and force-wipes before clearing
|
| 273 |
+
the line ⇒ LOSS (force-wipe / kill-bar unmet)."""
|
| 274 |
own = _own_ids(rs)
|
| 275 |
if not own:
|
| 276 |
return [Command.observe()]
|
| 277 |
return [Command.attack_move(own, 51, 20)]
|
| 278 |
|
| 279 |
|
| 280 |
def _focus_fire(rs, Command):
|
| 281 |
"""Focus-fire: ALL agent tanks attack_unit the SAME target each
|
| 282 |
turn — the closest enemy to the agent centroid. Once that enemy
|
|
|
|
| 350 |
)
|
| 351 |
|
| 352 |
|
| 353 |
+
@pytest.mark.parametrize("level", ["easy", "medium", "hard"])
|
| 354 |
@pytest.mark.parametrize("seed", [1, 2, 3, 4])
|
| 355 |
+
def test_medium_outnumbered_needs_controlled_engagement(level, seed):
|
| 356 |
+
"""The medium-tier 4-vs-3 over-match is the load-bearing
|
| 357 |
+
discrimination: the intended controlled focus-fire engagement
|
| 358 |
+
clears ≥3 of the 4 enemy tanks while keeping ≥2 of its own (WIN),
|
| 359 |
+
whereas the brute `attack_move` drive-in is enveloped in the
|
| 360 |
+
4-tank crossfire and force-wipes before killing 3 (LOSS). This
|
| 361 |
+
re-asserts the focus-WIN / brute-LOSS bar across every level —
|
| 362 |
+
the per-policy tests above already cover it, this is the
|
| 363 |
+
aggregate invariant pinned by the recalibration."""
|
|
|
|
|
|
|
|
|
|
| 364 |
pytest.importorskip("openra_train")
|
| 365 |
from openra_bench.eval_core import run_level
|
| 366 |
|
| 367 |
c = compile_level(load_pack(PACK_PATH), level)
|
| 368 |
+
win = run_level(c, _focus_fire, seed=seed)
|
| 369 |
+
lose = run_level(c, _brute_attack_move, seed=seed)
|
| 370 |
+
assert win.outcome == "win", (
|
| 371 |
+
f"{level} seed={seed}: controlled focus engagement must WIN, "
|
| 372 |
+
f"got {win.outcome} (kills={win.signals.units_killed}, "
|
| 373 |
+
f"losses={win.signals.units_lost})"
|
| 374 |
+
)
|
| 375 |
+
assert lose.outcome == "loss", (
|
| 376 |
+
f"{level} seed={seed}: brute drive-in must LOSE, got "
|
| 377 |
+
f"{lose.outcome} (kills={lose.signals.units_killed}, "
|
| 378 |
+
f"losses={lose.signals.units_lost})"
|
| 379 |
)
|
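The `_focus_fire` docstring describes sending every agent tank at the enemy closest to the agent centroid. The body of that helper is not shown in this diff, so the following is only a minimal sketch of the selection rule. It assumes units are dicts with `id`, `cell_x` and `cell_y` keys (the field names used by the removed `_spread_attack_closest` helper); `pick_focus_target` is a hypothetical name.

```python
from typing import Mapping, Optional, Sequence


def pick_focus_target(
    own: Sequence[Mapping], enemies: Sequence[Mapping]
) -> Optional[str]:
    """Return the id of the enemy nearest to the centroid of `own`.

    Ties are broken on the string id so the choice is deterministic.
    Returns None when either side is empty.
    """
    if not own or not enemies:
        return None
    # Centroid of the agent stack in cell coordinates.
    cx = sum(u["cell_x"] for u in own) / len(own)
    cy = sum(u["cell_y"] for u in own) / len(own)
    nearest = min(
        enemies,
        key=lambda e: (
            (e["cell_x"] - cx) ** 2 + (e["cell_y"] - cy) ** 2,  # squared distance
            str(e["id"]),                                        # tie-break
        ),
    )
    return str(nearest["id"])


if __name__ == "__main__":
    line = [
        {"id": "e15", "cell_x": 50, "cell_y": 15},
        {"id": "e20", "cell_x": 51, "cell_y": 20},
        {"id": "e25", "cell_x": 50, "cell_y": 25},
    ]
    centre_stack = [{"id": i, "cell_x": 30, "cell_y": y} for i, y in enumerate((19, 20, 21))]
    print(pick_focus_target(centre_stack, line))
```

With the easy-tier layout the two flank tanks are equidistant from the centroid (squared distance 425, against 441 for the centre tank at (51,20)), which is why the sketch adds an explicit tie-break. From the NORTH or SOUTH hard-tier corridors the nearest enemy is unique.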