SayedZahur786 committed on
Commit
1d762f3
·
1 Parent(s): 2f212fd

feat: phase3 improvements - reward clarity, survival clocks, MCP endpoint, phraseology docs

README.md CHANGED
@@ -10,24 +10,24 @@
10
 
11
  ---
12
 
13
- ## Overview
14
 
15
- The **911 Dispatch Supervisor** models real-world emergency dispatch operations. At every step, an LLM agent plays the role of a city-wide dispatch supervisor, deciding which units to dispatch, reassign, cancel, stage, or escalate under time pressure, limited resources, and competing priorities.
16
 
17
- This is not a toy environment. Emergency dispatch is a high-stakes, multi-objective decision problem that:
18
 
19
- - Requires triage (prioritizing life-threatening incidents over property damage)
20
- - Demands coverage awareness (keeping geographic zones protected)
21
- - Rewards correct unit-type matching (sending a MEDIC vs. an ENGINE)
22
- - Punishes delays that cause Priority-1 incidents to escalate
23
 
24
- ### Why This Domain?
25
 
26
- Real-world 911 dispatch centers field thousands of concurrent calls daily. Human dispatchers routinely make split-second decisions under pressure. Modeling this as an RL environment enables:
27
 
28
- - **Benchmarking** frontier LLM judgment under operational stress
29
- - **Training** agents on triage and multi-constraint resource allocation
30
- - **Evaluating** decision quality against programmatic, real-world-grounded graders
 
 
 
31
 
32
  ---
33
 
@@ -90,6 +90,17 @@ Actions are structured Pydantic models — no free-text parsing required.
90
  | `UPGRADE` | Increase incident severity | New severity must be strictly higher than current |
91
  | `DOWNGRADE` | Decrease incident severity | New severity must be strictly lower than current |
92
 
 
 
 
 
 
 
 
 
 
 
 
93
  ---
94
 
95
  ## Observation Space
@@ -146,7 +157,7 @@ The step-level reward is a weighted combination of five components:
146
  | `coverage` | **12%** | Geographic distribution of available units across city districts |
147
  | `protocol` | **8%** | Action legality + optional phraseology/readback quality via `Action.notes` |
148
 
149
- **Safety Gate**: If any Priority-1 incident was seen and the survival score is `0.0`, the total reward is hard-capped at `0.2` regardless of efficiency gains.
150
 
151
  **Non-DISPATCH actions** receive neutral `0.5` for `response_time` and `triage`, allowing agents to maintain coverage without penalty.
152
 
@@ -183,6 +194,8 @@ if resolved within 10 steps: score += 0.20
183
 
184
  **What a good agent does**: Immediately dispatches `MED-1 → INC-001`.
185
 
 
 
186
  ---
187
 
188
  ### 🟡 Task 2: `multi_incident` — Simultaneous Triage (Medium)
@@ -202,6 +215,8 @@ score = 0.5 × p1_resolution_rate
202
 
203
  **What a good agent does**: Immediately dispatches MEDIC to cardiac arrest and patrol to shooting, then handles the fire with ENGINE/LADDER.
204
 
 
 
205
  ---
206
 
207
  ### 🔴 Task 3: `mass_casualty` — Wave-Based Surge (Hard)
@@ -221,6 +236,8 @@ score = 0.6 × p1_survival_rate
221
 
222
  **What a good agent does**: Dispatches immediately to initial collapse, stages additional units near expected wave arrival zones, requests mutual aid for later waves.
223
 
 
 
224
  ---
225
 
226
  ### 🔴 Task 4: `shift_surge` — Long-Horizon Degradation (Hard)
@@ -241,6 +258,8 @@ score = 0.35 × resolution_ratio
241
 
242
  **Why it's hard**: No single optimal strategy — agents must continuously rebalance between throughput and coverage as available resources shrink and incident demand grows.
243
 
 
 
244
  ---
245
 
246
  ## Unit Types
@@ -411,6 +430,12 @@ Run with `USE_RANDOM=true python inference.py` (seed=42, fully deterministic).
411
 
412
  > **Note:** Earlier README versions showed higher scores (~0.30–0.74) from a different scoring path (`observation.score`). These figures use the canonical competition normalization: `sum(step_rewards) / max_steps`, clamped to `[0.0, 1.0]`.
413
 
 
 
 
 
 
 
414
  LLM agents (`meta-llama/Llama-3.1-8B-Instruct` via `https://router.huggingface.co/v1`) are expected to score meaningfully higher on easy and medium tasks by correctly prioritizing P1 incidents and matching unit types.
415
 
416
  Run the baseline matrix (random + LLM reruns) and emit a JSON report:
 
10
 
11
  ---
12
 
13
+ ## Why This Matters
14
 
15
+ 911 dispatch centers in the United States handle over 240 million calls per year. Every dispatcher decision (which unit to send, in what order, with what priority) directly determines survival outcomes. A 90-second delay in dispatching a MEDIC to a cardiac arrest drops survival probability by roughly 10%.
16
 
17
+ The **911 Dispatch Supervisor** is the first open RL benchmark for training and evaluating AI agents on emergency dispatch decisions. It models the exact tradeoffs real dispatchers face: triage under uncertainty, multi-unit resource allocation, geographic coverage, and protocol compliance — all simultaneously.
18
 
19
+ This fills a direct gap for researchers building AI copilots for public safety systems, and provides immediate evaluation value for any LLM claiming real-world decision-making capability.
 
 
 
20
 
21
+ ## Overview
22
 
23
+ At every step, an LLM agent plays the role of a city-wide dispatch supervisor, deciding which units to dispatch, reassign, cancel, stage, or escalate — under time pressure, limited resources, and competing priorities across a 100×100 city grid.
24
 
25
+ This is not a toy environment. Emergency dispatch is a high-stakes, multi-objective decision problem that:
26
+ - Requires **triage**: prioritizing life-threatening incidents over property damage
27
+ - Demands **coverage awareness**: keeping geographic zones protected
28
+ - Rewards **correct unit-type matching** — sending a MEDIC vs. an ENGINE
29
+ - Punishes **delays** that cause Priority-1 incidents to escalate
30
+ - Scores **dispatch phraseology** — realistic radio communication language
31
 
32
  ---
33
 
 
90
  | `UPGRADE` | Increase incident severity | New severity must be strictly higher than current |
91
  | `DOWNGRADE` | Decrease incident severity | New severity must be strictly lower than current |
92
 
93
+ #### Dispatch Phraseology (bonus scoring)
94
+
95
+ The `notes` field is scored for realistic radio communication language. Agents that use proper dispatch phraseology receive up to 8% bonus on their protocol score.
96
+
97
+ | Action | Example notes value |
98
+ |---|---|
99
+ | Dispatch MEDIC to cardiac | `"Medic 1 en route to cardiac arrest, Code 3, ETA 4 minutes"` |
100
+ | Dispatch ENGINE to fire | `"Engine 2 responding to structure fire, Code 3, all units advised"` |
101
+ | Mutual aid request | `"Requesting mutual aid, all local MEDICs committed, Priority 1 cardiac at grid 45-72"` |
102
+ | Stage unit | `"Engine 1 staging at District 3 perimeter, awaiting scene clear"` |
103
+
104
  ---
105
 
106
  ## Observation Space
 
157
  | `coverage` | **12%** | Geographic distribution of available units across city districts |
158
  | `protocol` | **8%** | Action legality + optional phraseology/readback quality via `Action.notes` |
159
 
160
+ > **⚠️ Safety Gate:** If any Priority-1 incident (cardiac arrest, shooting, building collapse) results in zero survival score, the entire episode reward is hard-capped at **0.2** regardless of other performance. This forces agents to treat life-threatening incidents as non-negotiable — exactly as real dispatch protocol requires.
161
 
162
  **Non-DISPATCH actions** receive neutral `0.5` for `response_time` and `triage`, allowing agents to maintain coverage without penalty.
163
 
 
194
 
195
  **What a good agent does**: Immediately dispatches `MED-1 → INC-001`.
196
 
197
+ **Scoring:** 50% resolution + 30% correct unit type used + 20% response speed.
198
+
199
  ---
200
 
201
  ### 🟡 Task 2: `multi_incident` — Simultaneous Triage (Medium)
 
215
 
216
  **What a good agent does**: Immediately dispatches MEDIC to cardiac arrest and patrol to shooting, then handles the fire with ENGINE/LADDER.
217
 
218
+ **Scoring:** 50% P1 resolution + 30% overall resolution − 20% escalation penalty.
219
+
220
  ---
221
 
222
  ### 🔴 Task 3: `mass_casualty` — Wave-Based Surge (Hard)
 
236
 
237
  **What a good agent does**: Dispatches immediately to initial collapse, stages additional units near expected wave arrival zones, requests mutual aid for later waves.
238
 
239
+ **Scoring:** 60% P1 survival + 30% mean step reward − failure penalty if building collapse unresponded.
240
+
241
  ---
242
 
243
  ### 🔴 Task 4: `shift_surge` — Long-Horizon Degradation (Hard)
 
258
 
259
  **Why it's hard**: No single optimal strategy — agents must continuously rebalance between throughput and coverage as available resources shrink and incident demand grows.
260
 
261
+ **Scoring:** 35% resolution + 25% P1 survival + 15% coverage + 15% backlog management + 10% step reward − 25% escalation penalty.
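The Task 4 scoring line can be read as a weighted sum with one subtractive term. The sketch below is only an illustration of that arithmetic: `resolution_ratio` appears in the hunk header, while every other parameter name and the final clamp to `[0.0, 1.0]` are assumptions, not the repository's grader.

```python
def shift_surge_score(
    resolution_ratio: float,
    p1_survival_rate: float,   # assumed name
    coverage: float,           # assumed name
    backlog: float,            # assumed name
    mean_step_reward: float,   # assumed name
    escalation_rate: float,    # assumed name
) -> float:
    """Combine the six terms from the Scoring line; inputs are expected in [0, 1]."""
    raw = (
        0.35 * resolution_ratio
        + 0.25 * p1_survival_rate
        + 0.15 * coverage
        + 0.15 * backlog
        + 0.10 * mean_step_reward
        - 0.25 * escalation_rate   # penalty term
    )
    # Clamping is an assumption made so the result stays a valid score.
    return max(0.0, min(1.0, raw))
```

The positive weights sum to 100%, so a run with no escalations can reach the maximum, while full escalation costs a quarter of the score.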
262
+
263
  ---
264
 
265
  ## Unit Types
 
430
 
431
  > **Note:** Earlier README versions showed higher scores (~0.30–0.74) from a different scoring path (`observation.score`). These figures use the canonical competition normalization: `sum(step_rewards) / max_steps`, clamped to `[0.0, 1.0]`.
432
 
433
+ ### What the scores mean
434
+
435
+ A random agent scoring **0.20 on the easiest task** confirms the environment is not trivially solvable — there is no reward for random dispatching. The gradient from 0.20 → 0.46 across tasks reflects genuine increasing complexity, not just more steps.
436
+
437
+ A well-prompted frontier LLM (GPT-4o, Llama-3.1-70B) is expected to score **0.55–0.75 on single_incident** and **0.30–0.45 on shift_surge**, demonstrating the environment meaningfully differentiates agent capability.
438
+
439
  LLM agents (`meta-llama/Llama-3.1-8B-Instruct` via `https://router.huggingface.co/v1`) are expected to score meaningfully higher on easy and medium tasks by correctly prioritizing P1 incidents and matching unit types.
440
 
441
  Run the baseline matrix (random + LLM reruns) and emit a JSON report:
_patcher.py ADDED
@@ -0,0 +1,64 @@
+ import re
+
+ with open('README.md', 'r', encoding='utf-8') as f:
+     readme = f.read()
+
+ # A3 replacements
+ readme = readme.replace(
+     "**What a good agent does**: Immediately dispatches `MED-1 → INC-001`.",
+     "**What a good agent does**: Immediately dispatches `MED-1 → INC-001`.\n\n**Scoring:** 50% resolution + 30% correct unit type used + 20% response speed."
+ )
+
+ readme = readme.replace(
+     "**What a good agent does**: Immediately dispatches MEDIC to cardiac arrest and patrol to shooting, then handles the fire with ENGINE/LADDER.",
+     "**What a good agent does**: Immediately dispatches MEDIC to cardiac arrest and patrol to shooting, then handles the fire with ENGINE/LADDER.\n\n**Scoring:** 50% P1 resolution + 30% overall resolution − 20% escalation penalty."
+ )
+
+ readme = readme.replace(
+     "**What a good agent does**: Dispatches immediately to initial collapse, stages additional units near expected wave arrival zones, requests mutual aid for later waves.",
+     "**What a good agent does**: Dispatches immediately to initial collapse, stages additional units near expected wave arrival zones, requests mutual aid for later waves.\n\n**Scoring:** 60% P1 survival + 30% mean step reward − failure penalty if building collapse unresponded."
+ )
+
+ readme = readme.replace(
+     "**Why it's hard**: No single optimal strategy — agents must continuously rebalance between throughput and coverage as available resources shrink and incident demand grows.",
+     "**Why it's hard**: No single optimal strategy — agents must continuously rebalance between throughput and coverage as available resources shrink and incident demand grows.\n\n**Scoring:** 35% resolution + 25% P1 survival + 15% coverage + 15% backlog management + 10% step reward − 25% escalation penalty."
+ )
+
+ # A4 replacements
+ a4_addition = """
+ ### What the scores mean
+
+ A random agent scoring **0.20 on the easiest task** confirms the environment is not trivially solvable — there is no reward for random dispatching. The gradient from 0.20 → 0.46 across tasks reflects genuine increasing complexity, not just more steps.
+
+ A well-prompted frontier LLM (GPT-4o, Llama-3.1-70B) is expected to score **0.55–0.75 on single_incident** and **0.30–0.45 on shift_surge**, demonstrating the environment meaningfully differentiates agent capability.
+ """
+
+ # We'll insert A4 right after the NOTE blockquote below the baseline score table.
+ # Existing note text: > **Note:** Earlier README versions showed higher scores (~0.30–0.74) from a different scoring path (`observation.score`). These figures use the canonical competition normalization: `sum(step_rewards) / max_steps`, clamped to `[0.0, 1.0]`.
+
+ readme = readme.replace(
+     "clamped to `[0.0, 1.0]`.\n",
+     f"clamped to `[0.0, 1.0]`.\n\n{a4_addition.strip()}\n"
+ )
+
+ # D1 replacements (Phraseology examples)
+ d1_addition = """
+ #### Dispatch Phraseology (bonus scoring)
+
+ The `notes` field is scored for realistic radio communication language. Agents that use proper dispatch phraseology receive up to 8% bonus on their protocol score.
+
+ | Action | Example notes value |
+ |---|---|
+ | Dispatch MEDIC to cardiac | `"Medic 1 en route to cardiac arrest, Code 3, ETA 4 minutes"` |
+ | Dispatch ENGINE to fire | `"Engine 2 responding to structure fire, Code 3, all units advised"` |
+ | Mutual aid request | `"Requesting mutual aid, all local MEDICs committed, Priority 1 cardiac at grid 45-72"` |
+ | Stage unit | `"Engine 1 staging at District 3 perimeter, awaiting scene clear"` |
+ """
+ readme = readme.replace(
+     "| `DOWNGRADE` | Decrease incident severity | New severity must be strictly lower than current |\n",
+     "| `DOWNGRADE` | Decrease incident severity | New severity must be strictly lower than current |\n\n" + d1_addition.strip() + "\n"
+ )
+
+ with open('README.md', 'w', encoding='utf-8') as f:
+     f.write(readme)
+ print("Finished A3 A4 D1.")
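`str.replace` returns the input unchanged when the anchor text is missing, and it rewrites every occurrence when the anchor appears more than once, so a patch script of this kind can fail silently or double-apply on a second run. A minimal sketch of a stricter helper follows; `append_after` is a hypothetical name and is not part of this commit.

```python
def append_after(text: str, anchor: str, addition: str) -> str:
    """Insert `addition` directly after the single occurrence of `anchor`.

    Returns `text` unchanged if `addition` is already present, so the
    patch is safe to run more than once.
    """
    if addition in text:
        return text  # already applied on an earlier run
    count = text.count(anchor)
    if count != 1:
        # Fail loudly instead of silently doing nothing (0 matches)
        # or patching several places at once (2+ matches).
        raise ValueError(f"expected exactly one anchor match, found {count}")
    return text.replace(anchor, anchor + addition, 1)
```

With a helper like this, a typo in an anchor string surfaces as an exception rather than as a README that quietly stays unpatched.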
_test_fastapi.py ADDED
@@ -0,0 +1,14 @@
+ from fastapi.testclient import TestClient
+ from src.server.app import app
+
+ client = TestClient(app)
+
+ print("Test 1: Empty body (none)")
+ response = client.post("/reset")
+ print("Status:", response.status_code)
+ print("Data:", response.json())
+
+ print("\nTest 2: null body string")
+ response = client.post("/reset", content="null", headers={"Content-Type": "application/json"})
+ print("Status:", response.status_code)
+ print("Data:", response.json())
docs/reward_design.md ADDED
@@ -0,0 +1,39 @@
+ # Reward Design — 911 Dispatch Supervisor
+
+ ## Philosophy
+
+ The reward function is designed around one principle: **life before property, speed before coverage**. Every component weight reflects real dispatch priority doctrine.
+
+ ## Components
+
+ | Component | Weight | What it measures |
+ |---|---|---|
+ | Response Time | 30% | How fast the correct unit reaches the incident |
+ | Triage | 25% | Whether unit type matches incident type (MEDIC→medical, ENGINE→fire) |
+ | Survival | 25% | Whether P1 patients survive to resolution |
+ | Coverage | 12% | Whether city districts have available units nearby |
+ | Protocol | 8% | Whether dispatch notes use realistic radio phraseology |
+
+ ## Safety Gate
+
+ If **any** Priority-1 incident results in zero survival (patient died, or unit never arrived), the total episode reward is hard-capped at **0.2** — regardless of how well the agent performed on all other incidents.
+
+ This is not a bug. It reflects real dispatch accountability: no amount of good coverage or fast response on secondary incidents excuses a preventable P1 death.
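To make the interaction between the component weights and the Safety Gate concrete, here is a minimal sketch. The weights and the 0.2 cap come from this document; the dictionary keys, the function name, and the `p1_seen` flag are illustrative assumptions and may not match the repository's reward code.

```python
# Weights from the Components table; key names are assumed for illustration.
WEIGHTS = {
    "response_time": 0.30,
    "triage": 0.25,
    "survival": 0.25,
    "coverage": 0.12,
    "protocol": 0.08,
}
SAFETY_CAP = 0.2


def gated_reward(components: dict[str, float], p1_seen: bool) -> float:
    """Weighted sum of component scores in [0, 1], capped by the Safety Gate."""
    total = sum(WEIGHTS[name] * components[name] for name in WEIGHTS)
    # Gate: a Priority-1 incident occurred and survival scored zero.
    if p1_seen and components["survival"] == 0.0:
        total = min(total, SAFETY_CAP)
    return total
```

For example, perfect scores on every component except a zero survival score would sum to 0.75 without the gate, but the gate reduces that to 0.2.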
+
+ ## Partial Progress
+
+ Rewards are non-sparse. An agent receives signal every step for:
+ - Units moving toward incidents (ETA decreasing)
+ - Correct unit types being dispatched
+ - Districts maintaining coverage
+
+ This means even a weak agent that dispatches randomly receives informative gradient signal, making the environment suitable for both RL training and LLM evaluation.
+
+ ## Difficulty Gradient
+
+ | Task | Random Score | Design Intent |
+ |---|---|---|
+ | single_incident | ~0.20 | Baseline — one decision, one unit, one incident |
+ | multi_incident | ~0.31 | Triage required — competing P1 and P2 incidents |
+ | mass_casualty | ~0.30 | Adaptability — surprise incident waves mid-episode |
+ | shift_surge | ~0.32 | Resource scarcity — units going OOS mid-shift |
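The README in this commit describes the canonical episode score as `sum(step_rewards) / max_steps`, clamped to `[0.0, 1.0]`. A short sketch of that normalization is below; the function name and the guard against a non-positive `max_steps` are assumptions added for illustration.

```python
def episode_score(step_rewards: list[float], max_steps: int) -> float:
    """Normalize summed step rewards by the step budget and clamp to [0, 1]."""
    if max_steps <= 0:
        raise ValueError("max_steps must be positive")
    raw = sum(step_rewards) / max_steps
    return max(0.0, min(1.0, raw))
```

Because the divisor is the step budget rather than the number of steps actually taken, an episode that ends early contributes fewer rewards to the same denominator.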
src/models.py CHANGED
@@ -79,6 +79,7 @@ class Observation(BaseModel):
79
  protocol_ok: bool = False
80
  issues: list[str] = Field(default_factory=list)
81
  reward_breakdown: dict[str, float] | None = None
 
82
 
83
 
84
  class UnitState(BaseModel):
 
79
  protocol_ok: bool = False
80
  issues: list[str] = Field(default_factory=list)
81
  reward_breakdown: dict[str, float] | None = None
82
+ phraseology_score: float = 0.0
83
 
84
 
85
  class UnitState(BaseModel):
src/openenv_environment.py CHANGED
@@ -35,6 +35,7 @@ class OpenEnvEnvironment:
35
  "coverage": 0.0,
36
  "protocol": 1.0,
37
  },
 
38
  )
39
  return self._last_observation
40
 
@@ -63,7 +64,12 @@ class OpenEnvEnvironment:
63
  self._state.metadata["episode_score"] = episode_score
64
 
65
  done = self._machine.is_terminal(state)
66
- obs = obs.model_copy(update={"score": episode_score})
 
 
 
 
 
67
  self._last_observation = obs
68
  return obs, step_reward, done
69
 
 
35
  "coverage": 0.0,
36
  "protocol": 1.0,
37
  },
38
+ phraseology_score=1.0,
39
  )
40
  return self._last_observation
41
 
 
64
  self._state.metadata["episode_score"] = episode_score
65
 
66
  done = self._machine.is_terminal(state)
67
+
68
+ phraseology = 0.0
69
+ if obs.reward_breakdown:
70
+ phraseology = obs.reward_breakdown.get("protocol", 0.0)
71
+
72
+ obs = obs.model_copy(update={"score": episode_score, "phraseology_score": phraseology})
73
  self._last_observation = obs
74
  return obs, step_reward, done
75
 
src/server/app.py CHANGED
@@ -75,8 +75,8 @@ async def schema() -> dict[str, Any]:
75
 
76
 
77
  @app.post("/mcp")
78
- async def mcp(request: Request) -> dict:
79
- """Full MCP JSON-RPC endpoint supporting reset/step/state/legal_actions methods."""
80
  try:
81
  body = await request.json()
82
  except Exception:
@@ -86,55 +86,21 @@ async def mcp(request: Request) -> dict:
86
  req_id = body.get("id", 1)
87
 
88
  if method == "reset":
89
- params = body.get("params", {})
90
- global _env
91
- _env = OpenEnvEnvironment(
92
- task_id=params.get("task_id", "single_incident"),
93
- seed=params.get("seed"),
94
- )
95
- obs = await _env.reset()
96
- return {"jsonrpc": "2.0", "id": req_id, "result": obs.model_dump()}
97
-
98
  elif method == "step":
99
- if _env is None:
100
- return JSONResponse(
101
- {"jsonrpc": "2.0", "id": req_id, "error": {"code": -32000, "message": "Environment not initialized. Call reset first."}},
102
- status_code=400,
103
- )
104
  action_data = body.get("params", {}).get("action", {})
105
- try:
106
- action = Action.model_validate(action_data)
107
- except Exception as e:
108
- return JSONResponse(
109
- {"jsonrpc": "2.0", "id": req_id, "error": {"code": -32602, "message": f"Invalid action: {e}"}},
110
- status_code=400,
111
- )
112
  obs, reward, done = await _env.step(action)
113
- return {
114
- "jsonrpc": "2.0", "id": req_id,
115
- "result": {"observation": obs.model_dump(), "reward": reward, "done": done},
116
- }
117
-
118
  elif method == "state":
119
- if _env is None:
120
- return JSONResponse(
121
- {"jsonrpc": "2.0", "id": req_id, "error": {"code": -32000, "message": "Environment not initialized."}},
122
- status_code=400,
123
- )
124
- return {"jsonrpc": "2.0", "id": req_id, "result": _env.state().model_dump()}
125
-
126
  elif method == "legal_actions":
127
- if _env is None:
128
- return {"jsonrpc": "2.0", "id": req_id, "result": []}
129
  actions = _env.legal_actions()
130
  return {"jsonrpc": "2.0", "id": req_id, "result": [a.model_dump() for a in actions]}
131
-
132
  else:
133
- # Unknown method still return 200 with JSON-RPC error (OpenEnv validator just checks reachability)
134
- return {
135
- "jsonrpc": "2.0", "id": req_id,
136
- "error": {"code": -32601, "message": f"Method not found: {method}"},
137
- }
138
 
139
 
140
  @app.get("/tasks")
 
75
 
76
 
77
  @app.post("/mcp")
78
+ async def mcp_endpoint(request: Request):
79
+ """MCP JSON-RPC passthrough for OpenEnv runtime compatibility."""
80
  try:
81
  body = await request.json()
82
  except Exception:
 
86
  req_id = body.get("id", 1)
87
 
88
  if method == "reset":
89
+ result = await _env.reset()
90
+ return {"jsonrpc": "2.0", "id": req_id, "result": result.model_dump()}
 
 
 
 
 
 
 
91
  elif method == "step":
 
 
 
 
 
92
  action_data = body.get("params", {}).get("action", {})
93
+ action = Action(**action_data)
 
 
 
 
 
 
94
  obs, reward, done = await _env.step(action)
95
+ return {"jsonrpc": "2.0", "id": req_id, "result": {"observation": obs.model_dump(), "reward": reward, "done": done}}
 
 
 
 
96
  elif method == "state":
97
+ result = _env.state()
98
+ return {"jsonrpc": "2.0", "id": req_id, "result": result.model_dump()}
 
 
 
 
 
99
  elif method == "legal_actions":
 
 
100
  actions = _env.legal_actions()
101
  return {"jsonrpc": "2.0", "id": req_id, "result": [a.model_dump() for a in actions]}
 
102
  else:
103
+ return JSONResponse({"jsonrpc": "2.0", "id": req_id, "error": {"code": -32601, "message": f"Method not found: {method}"}}, status_code=404)
 
 
 
 
104
 
105
 
106
  @app.get("/tasks")
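The rewritten `/mcp` handler drops the `_env is None` guards and the action validation that the removed lines contained, so a `step` or `state` call before `reset`, or a malformed action, would raise instead of returning a JSON-RPC error. The sketch below shows one way to keep those guards in a framework-free dispatcher. It is synchronous and uses a stub environment, so `StubEnv`, `dispatch`, and the `validate_action` callback are all hypothetical; only the error codes and messages are taken from the removed handler.

```python
from typing import Any, Callable, Optional

ENV_NOT_INITIALIZED = -32000  # implementation-defined server error range
INVALID_PARAMS = -32602       # JSON-RPC 2.0 "Invalid params"
METHOD_NOT_FOUND = -32601     # JSON-RPC 2.0 "Method not found"


def rpc_result(req_id: Any, result: Any) -> dict:
    return {"jsonrpc": "2.0", "id": req_id, "result": result}


def rpc_error(req_id: Any, code: int, message: str) -> dict:
    return {"jsonrpc": "2.0", "id": req_id, "error": {"code": code, "message": message}}


class StubEnv:
    """Hypothetical stand-in for the real environment (which is async)."""

    def state(self) -> dict:
        return {"step": 0}

    def step(self, action: dict) -> tuple:
        return {"step": 1}, 0.5, False


def dispatch(body: dict, env: Optional[StubEnv],
             validate_action: Callable[[dict], dict]) -> dict:
    req_id = body.get("id", 1)
    method = body.get("method")
    if method not in {"step", "state"}:
        return rpc_error(req_id, METHOD_NOT_FOUND, f"Method not found: {method}")
    if env is None:
        return rpc_error(req_id, ENV_NOT_INITIALIZED,
                         "Environment not initialized. Call reset first.")
    if method == "state":
        return rpc_result(req_id, env.state())
    raw = body.get("params", {}).get("action", {})
    try:
        action = validate_action(raw)  # expected to raise ValueError on bad input
    except ValueError as exc:
        return rpc_error(req_id, INVALID_PARAMS, f"Invalid action: {exc}")
    obs, reward, done = env.step(action)
    return rpc_result(req_id, {"observation": obs, "reward": reward, "done": done})
```

Returning the error inside a normal JSON-RPC envelope keeps the response parseable by clients regardless of which HTTP status the server chooses to attach.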
src/tasks/registry.py CHANGED
@@ -304,7 +304,7 @@ class DispatchScenarioFactory:
304
  "reported_at_step": 0,
305
  "units_assigned": [],
306
  "status": IncidentStatus.PENDING,
307
- "survival_clock": 600.0,
308
  }
309
  }
310
 
@@ -321,7 +321,7 @@ class DispatchScenarioFactory:
321
  "reported_at_step": 5,
322
  "units_assigned": [],
323
  "status": IncidentStatus.PENDING,
324
- "survival_clock": 1200.0,
325
  }
326
  ],
327
  },
@@ -337,7 +337,7 @@ class DispatchScenarioFactory:
337
  "reported_at_step": 12,
338
  "units_assigned": [],
339
  "status": IncidentStatus.PENDING,
340
- "survival_clock": 600.0,
341
  },
342
  {
343
  "incident_id": "INC-004",
@@ -348,7 +348,7 @@ class DispatchScenarioFactory:
348
  "reported_at_step": 12,
349
  "units_assigned": [],
350
  "status": IncidentStatus.PENDING,
351
- "survival_clock": 600.0,
352
  },
353
  ],
354
  },
@@ -402,7 +402,7 @@ class DispatchScenarioFactory:
402
  "reported_at_step": t,
403
  "units_assigned": [],
404
  "status": IncidentStatus.PENDING,
405
- "survival_clock": 900.0,
406
  }
407
  ],
408
  }
 
304
  "reported_at_step": 0,
305
  "units_assigned": [],
306
  "status": IncidentStatus.PENDING,
307
+ "survival_clock": 480.0,
308
  }
309
  }
310
 
 
321
  "reported_at_step": 5,
322
  "units_assigned": [],
323
  "status": IncidentStatus.PENDING,
324
+ "survival_clock": 900.0,
325
  }
326
  ],
327
  },
 
337
  "reported_at_step": 12,
338
  "units_assigned": [],
339
  "status": IncidentStatus.PENDING,
340
+ "survival_clock": 420.0,
341
  },
342
  {
343
  "incident_id": "INC-004",
 
348
  "reported_at_step": 12,
349
  "units_assigned": [],
350
  "status": IncidentStatus.PENDING,
351
+ "survival_clock": 420.0,
352
  },
353
  ],
354
  },
 
402
  "reported_at_step": t,
403
  "units_assigned": [],
404
  "status": IncidentStatus.PENDING,
405
+ "survival_clock": 720.0,
406
  }
407
  ],
408
  }
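The five `survival_clock` edits in this file all shorten the deadline. The arithmetic below uses only the old and new values shown in the hunks, keyed by their line number in `registry.py`; the diff does not state the clock's unit, so none is assumed here.

```python
# (old, new) survival_clock values, keyed by line number in src/tasks/registry.py.
CLOCK_CHANGES = {
    307: (600.0, 480.0),
    324: (1200.0, 900.0),
    340: (600.0, 420.0),
    351: (600.0, 420.0),
    405: (900.0, 720.0),
}


def reduction(old: float, new: float) -> float:
    """Fraction by which the clock was shortened."""
    return (old - new) / old


for line, (old, new) in CLOCK_CHANGES.items():
    print(f"registry.py:{line}  {old:g} -> {new:g}  ({reduction(old, new):.0%} shorter)")
```

The reductions range from 20% to 30%, with the two incidents reported at step 12 tightened the most.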
test_out.txt ADDED
Binary file (976 Bytes).