# Task Reference - 911 Dispatch Supervisor

## Task 1: `single_incident` - Easy

**Objective:** Dispatch the correct unit to a single cardiac arrest and resolve it
before the survival clock expires.

**Initial State:** 1 incident (cardiac arrest, P1), 3 units available (1 MEDIC, 1 ENGINE, 1 PATROL)
**Max Steps:** 20 | **Survival Clock:** 600s

**What a good agent does:** Immediately dispatches MEDIC to the cardiac arrest.
Does not dispatch ENGINE or PATROL (triage mismatch penalty).

**What a bad agent does:** Dispatches ENGINE (wrong unit type) and wastes steps until
the patient's survival clock expires → Safety Gate → score capped at 0.2.

**Scoring:** 50% resolution + 30% correct unit type + 20% response speed
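
The weighting and the Safety Gate cap can be sketched as follows. This is a minimal illustration, not the environment's actual grader: the function name, its parameters, and the assumption that response speed arrives already normalized to the range 0 to 1 are all hypothetical. Only the 50/30/20 weights and the 0.2 cap come from the task description.

```python
# Cap applied when the survival clock expires (value taken from the task text).
SAFETY_GATE_CAP = 0.2


def single_incident_score(resolved: bool, correct_unit: bool,
                          response_speed: float, clock_expired: bool) -> float:
    """Hypothetical scorer for `single_incident`.

    Assumption: response_speed is already normalized to [0, 1],
    where 1.0 means the fastest possible response.
    """
    speed = min(max(response_speed, 0.0), 1.0)
    score = 0.5 * float(resolved) + 0.3 * float(correct_unit) + 0.2 * speed
    if clock_expired:
        # Safety Gate: the score cannot exceed the cap once the clock has run out.
        score = min(score, SAFETY_GATE_CAP)
    return score
```

Under these assumptions, an agent that sends the right unit quickly but still lets the clock expire scores no more than 0.2, which is why the gate dominates every other component.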

---

## Task 2: `multi_incident` - Medium

**Objective:** Triage 3 simultaneous incidents with competing priorities.

**Initial State:** 3 incidents (structure fire P2, cardiac arrest P1, shooting P1),
6 units available
**Max Steps:** 40

**What a good agent does:** Immediately dispatches MEDIC to cardiac arrest and
PATROL to shooting (both P1), then dispatches ENGINE to structure fire (P2).

**What a bad agent does:** Dispatches to the fire first (visible/dramatic but P2),
leaving P1 incidents unattended → Safety Gate.
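
The priority-first ordering described above can be sketched as a stable sort on priority. The `Incident` class and `dispatch_order` function are hypothetical names introduced for illustration; the unit-to-incident pairings are the ones named in this task.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    kind: str
    priority: int  # 1 means P1 (most urgent), 2 means P2


# Pairings stated in the task description.
REQUIRED_UNIT = {
    "cardiac arrest": "MEDIC",
    "shooting": "PATROL",
    "structure fire": "ENGINE",
}


def dispatch_order(incidents: list[Incident]) -> list[tuple[str, str]]:
    """Return (incident kind, unit type) pairs, most urgent first."""
    # sorted() is stable, so incidents with equal priority keep their board order.
    ordered = sorted(incidents, key=lambda inc: inc.priority)
    return [(inc.kind, REQUIRED_UNIT[inc.kind]) for inc in ordered]


board = [
    Incident("structure fire", 2),
    Incident("cardiac arrest", 1),
    Incident("shooting", 1),
]
plan = dispatch_order(board)
```

Here `plan` lists the cardiac arrest and the shooting before the structure fire, even though the fire appears first on the board.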

**Scoring:** 50% P1 resolution + 30% overall resolution − 20% escalation penalty
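
One way to read this formula, assuming each term is a fraction between 0 and 1 and the result is clamped to that range (both are assumptions, as is the function name):

```python
def multi_incident_score(p1_resolution: float, overall_resolution: float,
                         escalation: float) -> float:
    """Hypothetical scorer for `multi_incident`; all inputs assumed in [0, 1]."""
    raw = 0.5 * p1_resolution + 0.3 * overall_resolution - 0.2 * escalation
    # Assumption: the final score is clamped to [0, 1].
    return min(max(raw, 0.0), 1.0)
```

Under this reading the positive weights sum to 0.8, so a run with every incident resolved and no escalations would score 0.8 rather than 1.0.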

---

## Task 3: `mass_casualty` - Hard

**Objective:** Manage a building collapse with surprise incident waves.

**Initial State:** 1 incident (building collapse P1, survival 480s), 7 units
**Max Steps:** 60
**Wave spawns:** Step 5 → structure fire; Step 12 → 2× cardiac arrests
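
The wave schedule can be represented as a simple step-indexed lookup. This is an illustrative sketch: the `WAVES` table and `incidents_spawning_at` helper are hypothetical names, and only the step numbers and incident types come from the task description.

```python
# Step number -> incident kinds that spawn at that step (from the task text).
WAVES: dict[int, list[str]] = {
    5: ["structure fire"],
    12: ["cardiac arrest", "cardiac arrest"],
}


def incidents_spawning_at(step: int) -> list[str]:
    """Return the incident kinds that appear at the given step (empty if none)."""
    return list(WAVES.get(step, []))


# Steps at which an agent should expect new demand on its units.
upcoming = sorted(WAVES)
```

An agent that knows the schedule (or learns it across episodes) can use `upcoming` to decide how many units to hold back before each wave.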

**What a good agent does:** Responds to building collapse immediately, pre-stages
units for anticipated waves, adapts when cardiac arrests spawn at step 12.

**What a bad agent does:** Commits all units to building collapse, has no
available units when cardiac arrests spawn → multiple P1 failures → Safety Gate.

**Scoring:** 60% P1 survival + 30% mean step reward − failure penalty

---

## Task 4: `shift_surge` - Hard

**Objective:** Maintain coverage as units go out of service mid-shift.

**Initial State:** 5 units, 0 incidents (board starts empty)
**Max Steps:** 60 | **Wave spawn:** Every 8 steps | **Survival clock:** 720s
**OOS events:** 3 units go OUT_OF_SERVICE by step 5
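
The resource squeeze implied by these numbers can be worked out directly. The sketch below assumes the first wave arrives at step 8 (one full interval after the start) and ignores mutual aid; both are assumptions, and the helper names are hypothetical.

```python
MAX_STEPS = 60
WAVE_INTERVAL = 8
INITIAL_UNITS = 5
OOS_UNITS = 3
OOS_DEADLINE_STEP = 5


def wave_steps(max_steps: int = MAX_STEPS, interval: int = WAVE_INTERVAL) -> list[int]:
    """Steps at which a wave spawns, assuming the first wave lands at step == interval."""
    return list(range(interval, max_steps + 1, interval))


def units_in_service(step: int) -> int:
    """Units left once all OOS events have landed (no mutual aid counted).

    Before the deadline the exact count is not specified in the task text,
    so this returns the initial count as an upper bound.
    """
    if step >= OOS_DEADLINE_STEP:
        return INITIAL_UNITS - OOS_UNITS
    return INITIAL_UNITS


first_wave = wave_steps()[0]
units_at_first_wave = units_in_service(first_wave)
```

Under these assumptions only 2 of the 5 units remain when the first wave arrives, which is why requesting mutual aid early matters.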

**What a good agent does:** Anticipates resource scarcity, requests mutual aid
early, stages remaining units strategically, prioritizes P1 incidents as board fills.

**What a bad agent does:** Dispatches all units freely in early steps, has no
coverage when OOS events hit and new incidents spawn simultaneously.

**Scoring:** 35% resolution + 25% P1 survival + 15% coverage + 15% backlog +
10% step reward − 25% escalation penalty
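
With this many terms, the formula is easier to check as a weight table. The sketch below is hypothetical: the component keys, the function name, the normalization of every component to the range 0 to 1, and the final clamp are assumptions. The weights themselves are taken from the scoring line above.

```python
# Positive weights from the scoring line; they sum to 1.0.
WEIGHTS = {
    "resolution": 0.35,
    "p1_survival": 0.25,
    "coverage": 0.15,
    "backlog": 0.15,
    "step_reward": 0.10,
}
ESCALATION_WEIGHT = 0.25


def shift_surge_score(components: dict[str, float], escalation: float) -> float:
    """Hypothetical scorer for `shift_surge`; all inputs assumed in [0, 1]."""
    positive = sum(WEIGHTS[name] * components[name] for name in WEIGHTS)
    raw = positive - ESCALATION_WEIGHT * escalation
    # Assumption: the final score is clamped to [0, 1].
    return min(max(raw, 0.0), 1.0)


perfect = shift_surge_score({name: 1.0 for name in WEIGHTS}, escalation=0.0)
```

Because the escalation penalty is subtracted after the weighted sum, a fully escalated run loses a quarter of a point regardless of how well the other components score.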