Commit ·
984aa3b
1
Parent(s): 13517a8
docs: polish README; remove emoji
- Remove frontmatter emoji and tighten intro/overview wording
- Minor formatting cleanup
- Add beginner-facing PROJECT_COMPLETE_GUIDE.md
- PROJECT_COMPLETE_GUIDE.md +346 -0
- README.md +17 -24
PROJECT_COMPLETE_GUIDE.md
ADDED
|
@@ -0,0 +1,346 @@
| 1 |
+
# 911 Dispatch Project - Complete Beginner Guide
|
| 2 |
+
|
| 3 |
+
## 1. What this project is (in plain language)
|
| 4 |
+
|
| 5 |
+
This project is a simulator where an AI agent learns to behave like a city emergency dispatch supervisor.
|
| 6 |
+
|
| 7 |
+
Think of it like a strategy game:
|
| 8 |
+
- There are emergencies (incidents).
|
| 9 |
+
- There are responders (fire, police, EMS units).
|
| 10 |
+
- The agent must choose what to do each turn (dispatch, reassign, cancel, request mutual aid, etc.).
|
| 11 |
+
- The simulator gives a score for each decision and a final score for the whole run.
|
| 12 |
+
|
| 13 |
+
The goal is to train and evaluate decision-making quality under pressure.
|
| 14 |
+
|
| 15 |
+
## 2. What an RL environment means
|
| 16 |
+
|
| 17 |
+
RL means Reinforcement Learning.
|
| 18 |
+
|
| 19 |
+
RL is built on four core ideas:
|
| 20 |
+
- Agent: the decision-maker (your model or baseline policy).
|
| 21 |
+
- Environment: the world that reacts to actions (this simulator).
|
| 22 |
+
- Reward: a number that says how good/bad the last action outcome was.
|
| 23 |
+
- Episode: one complete run from start to finish.
|
| 24 |
+
|
| 25 |
+
For this project:
|
| 26 |
+
- Agent picks an action.
|
| 27 |
+
- Environment updates city state.
|
| 28 |
+
- Environment returns:
|
| 29 |
+
- updated observation,
|
| 30 |
+
- reward,
|
| 31 |
+
- done flag (whether the run is over).
|
| 32 |
+
|
| 33 |
+
That loop repeats until the episode ends.
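The loop described above can be sketched in a few lines of Python. This is only an illustration: `ToyEnv`, its `reset`/`step` methods, the reward values, and `run_episode` are hypothetical stand-ins, not the project's actual classes.

```python
from typing import Callable, Tuple


class ToyEnv:
    """Hypothetical stand-in for the simulator: ends after max_steps steps."""

    def __init__(self, max_steps: int = 5) -> None:
        self.max_steps = max_steps
        self.step_count = 0

    def reset(self) -> dict:
        # Start a new episode and return the first observation.
        self.step_count = 0
        return {"step_count": self.step_count}

    def step(self, action: str) -> Tuple[dict, float, bool]:
        # Apply the action, then return (observation, reward, done).
        self.step_count += 1
        reward = 1.0 if action == "DISPATCH" else 0.5  # illustrative values only
        done = self.step_count >= self.max_steps
        return {"step_count": self.step_count}, reward, done


def run_episode(env: ToyEnv, policy: Callable[[dict], str]) -> float:
    """Run one episode and return the sum of step rewards."""
    observation = env.reset()
    total, done = 0.0, False
    while not done:
        action = policy(observation)  # agent picks an action
        observation, reward, done = env.step(action)  # environment reacts
        total += reward
    return total
```

A policy here is just a function from observation to action, for example `run_episode(ToyEnv(), lambda obs: "DISPATCH")`.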
|
| 34 |
+
|
| 35 |
+
## 3. Important clarification: "scheme of electricity" vs "city schema"
|
| 36 |
+
|
| 37 |
+
There is no electricity scheme in this codebase.
|
| 38 |
+
|
| 39 |
+
What exists is a city schema.
|
| 40 |
+
|
| 41 |
+
City schema means a configuration blueprint for the simulation:
|
| 42 |
+
- city size (grid),
|
| 43 |
+
- districts,
|
| 44 |
+
- available units,
|
| 45 |
+
- unit speeds,
|
| 46 |
+
- default recommended unit types for each incident type.
|
| 47 |
+
|
| 48 |
+
The schema is loaded from data files and used to initialize deterministic, repeatable scenarios.
|
| 49 |
+
|
| 50 |
+
## 4. Project architecture (high level)
|
| 51 |
+
|
| 52 |
+
1. Scenario/task setup
|
| 53 |
+
- A task fixture builds initial units/incidents and metadata.
|
| 54 |
+
|
| 55 |
+
2. State machine update engine
|
| 56 |
+
- Validates actions.
|
| 57 |
+
- Applies action effects.
|
| 58 |
+
- Advances time by one tick.
|
| 59 |
+
- Updates incident statuses and unit statuses.
|
| 60 |
+
|
| 61 |
+
3. Reward + scoring
|
| 62 |
+
- Computes per-step reward components.
|
| 63 |
+
- Computes episode-level score using task-specific graders.
|
| 64 |
+
|
| 65 |
+
4. API server
|
| 66 |
+
- Exposes reset/step/state endpoints.
|
| 67 |
+
|
| 68 |
+
5. Dashboard
|
| 69 |
+
- Polls backend state repeatedly and renders units/incidents + reward bars.
|
| 70 |
+
|
| 71 |
+
## 5. What is the task?
|
| 72 |
+
|
| 73 |
+
A task is a scenario type with its own initial conditions, difficulty, and final grading logic.
|
| 74 |
+
|
| 75 |
+
This project has 4 tasks:
|
| 76 |
+
|
| 77 |
+
1. single_incident (easy)
|
| 78 |
+
- One incident, small unit pool.
|
| 79 |
+
- Focus: dispatch the right unit fast.
|
| 80 |
+
|
| 81 |
+
2. multi_incident (medium)
|
| 82 |
+
- Multiple incidents at the same time.
|
| 83 |
+
- Focus: triage/prioritization and handling P1 incidents.
|
| 84 |
+
|
| 85 |
+
3. mass_casualty (hard)
|
| 86 |
+
- Incident waves with severe emergencies and resource conflicts.
|
| 87 |
+
- Focus: survival outcomes under surge.
|
| 88 |
+
|
| 89 |
+
4. shift_surge (hard)
|
| 90 |
+
- New incidents arrive over time and some units go out of service.
|
| 91 |
+
- Focus: long-horizon operations and city coverage under degradation.
|
| 92 |
+
|
| 93 |
+
## 6. What is an episode?
|
| 94 |
+
|
| 95 |
+
An episode is one full run of a task, from reset until a terminal condition is reached.
|
| 96 |
+
|
| 97 |
+
An episode starts when reset is called:
|
| 98 |
+
- step_count starts at 0.
|
| 99 |
+
- city_time starts at 0 seconds.
|
| 100 |
+
- units and incidents are loaded from the selected task fixture.
|
| 101 |
+
|
| 102 |
+
Episode ends when any terminal condition is hit:
|
| 103 |
+
- max steps reached,
|
| 104 |
+
- at least one incident escalates,
|
| 105 |
+
- all incidents resolved.
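As a rough sketch, the three terminal conditions can be combined in a single check. The function name, its parameters, and the status strings `"ESCALATED"` and `"RESOLVED"` are assumptions made for illustration; the project's own status values may differ.

```python
from typing import Sequence


def is_terminal(step_count: int, max_steps: int, incident_statuses: Sequence[str]) -> bool:
    """Return True when any of the three terminal conditions holds."""
    # Condition 1: the step budget is exhausted.
    if step_count >= max_steps:
        return True
    # Condition 2: at least one incident has escalated (status name assumed).
    if any(status == "ESCALATED" for status in incident_statuses):
        return True
    # Condition 3: every incident is resolved (status name assumed).
    # Note: all() of an empty sequence is True, so "no incidents" counts as done here.
    return all(status == "RESOLVED" for status in incident_statuses)
```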
|
| 106 |
+
|
| 107 |
+
## 7. What is a step?
|
| 108 |
+
|
| 109 |
+
A step is one action cycle:
|
| 110 |
+
|
| 111 |
+
1. Agent sends one action.
|
| 112 |
+
2. Validator checks if action is legal.
|
| 113 |
+
3. State machine applies action effects.
|
| 114 |
+
4. Time advances by 30 seconds.
|
| 115 |
+
5. Reward is computed.
|
| 116 |
+
6. Observation + reward + done are returned.
|
| 117 |
+
|
| 118 |
+
Important:
|
| 119 |
+
- step_count increases by 1 per step.
|
| 120 |
+
- city_time increases by 30 seconds per step.
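Because both counters advance together, city_time can be derived from step_count. A tiny sketch, where the constant and function names are illustrative rather than taken from the codebase:

```python
TICK_SECONDS = 30.0  # simulated seconds added per step, as stated above


def city_time_after(step_count: int) -> float:
    """Simulated seconds elapsed after the given number of steps."""
    return step_count * TICK_SECONDS
```

For instance, 8 steps correspond to 8 × 30 = 240 simulated seconds.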
|
| 121 |
+
|
| 122 |
+
## 8. At what step are we right now?
|
| 123 |
+
|
| 124 |
+
Snapshot from the live backend at the time this guide was generated:
|
| 125 |
+
|
| 126 |
+
- task_id: multi_incident
|
| 127 |
+
- episode_id: d2cd525e-2596-44cb-bbe3-af33236264a0
|
| 128 |
+
- step_count: 8
|
| 129 |
+
- city_time: 240.0 seconds
|
| 130 |
+
- cumulative_reward: 1.6
|
| 131 |
+
- episode_score: 0.0
|
| 132 |
+
- legal_actions currently available: 36
|
| 133 |
+
|
| 134 |
+
These are live values, not constants. If you reset again, step_count returns to 0.
|
| 135 |
+
|
| 136 |
+
## 9. Action space (what actions exist)
|
| 137 |
+
|
| 138 |
+
Current action types include:
|
| 139 |
+
- DISPATCH
|
| 140 |
+
- CANCEL
|
| 141 |
+
- REASSIGN
|
| 142 |
+
- STAGE
|
| 143 |
+
- MUTUAL_AID
|
| 144 |
+
- UPGRADE
|
| 145 |
+
- DOWNGRADE
|
| 146 |
+
|
| 147 |
+
Legal actions are generated from current state and filtered by protocol validation, so only valid actions appear in legal_actions.
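One way to picture this: enumerate candidate actions, then keep only those the validator accepts. The sketch below uses the action names listed above; the `ActionType` enum, the `filter_legal` helper, and the validator callback are hypothetical and are not the project's API.

```python
from enum import Enum
from typing import Callable, Iterable, List


class ActionType(Enum):
    """Action types named in this guide (the real list may include others)."""
    DISPATCH = "DISPATCH"
    CANCEL = "CANCEL"
    REASSIGN = "REASSIGN"
    STAGE = "STAGE"
    MUTUAL_AID = "MUTUAL_AID"
    UPGRADE = "UPGRADE"
    DOWNGRADE = "DOWNGRADE"


def filter_legal(candidates: Iterable[ActionType],
                 is_valid: Callable[[ActionType], bool]) -> List[ActionType]:
    """Keep only the candidate actions that pass the validator."""
    return [action for action in candidates if is_valid(action)]
```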
|
| 148 |
+
|
| 149 |
+
## 10. How scoring works (complete detail)
|
| 150 |
+
|
| 151 |
+
There are two scoring layers:
|
| 152 |
+
|
| 153 |
+
1. Step reward (every action)
|
| 154 |
+
2. Episode score (whole run)
|
| 155 |
+
|
| 156 |
+
### 10.1 Step reward (RewardCalculator)
|
| 157 |
+
|
| 158 |
+
Step reward uses a weighted sum of 5 components:
|
| 159 |
+
- response_time: 30%
|
| 160 |
+
- triage: 25%
|
| 161 |
+
- survival: 25%
|
| 162 |
+
- coverage: 12%
|
| 163 |
+
- protocol: 8%
|
| 164 |
+
|
| 165 |
+
Total formula:
|
| 166 |
+
- total = 0.30 * response_time + 0.25 * triage + 0.25 * survival + 0.12 * coverage + 0.08 * protocol
|
| 167 |
+
- result is clamped to [0, 1]
|
| 168 |
+
|
| 169 |
+
Safety rule:
|
| 170 |
+
- If any Priority-1 incident existed and the survival component is 0, the total is capped at 0.2.
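The formula, clamp, and safety rule can be sketched together as below. The weights and the 0.2 cap come from the text above; the function name, its signature, and the order of operations (clamp first, then cap) are assumptions, not a copy of RewardCalculator.

```python
from typing import Dict

WEIGHTS: Dict[str, float] = {
    "response_time": 0.30,
    "triage": 0.25,
    "survival": 0.25,
    "coverage": 0.12,
    "protocol": 0.08,
}
P1_FAILURE_CAP = 0.2


def step_reward(components: Dict[str, float], p1_seen: bool) -> float:
    """Weighted sum of the five components, clamped to [0, 1], with the P1 cap."""
    total = sum(weight * components[name] for name, weight in WEIGHTS.items())
    total = max(0.0, min(1.0, total))  # clamp to [0, 1]
    # Safety rule: a P1 incident existed and survival is zero, so cap the total.
    if p1_seen and components["survival"] == 0.0:
        total = min(total, P1_FAILURE_CAP)
    return total
```

With every component at 1.0 except survival at 0.0, the weighted sum is 0.75; if a P1 incident was seen, the cap reduces it to 0.2.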
|
| 171 |
+
|
| 172 |
+
Component details:
|
| 173 |
+
|
| 174 |
+
1. response_time
|
| 175 |
+
- Only meaningful for DISPATCH.
|
| 176 |
+
- For non-DISPATCH actions it returns neutral 0.5.
|
| 177 |
+
- For DISPATCH: compares ETA to severity benchmark.
|
| 178 |
+
|
| 179 |
+
2. triage
|
| 180 |
+
- Only meaningful for DISPATCH.
|
| 181 |
+
- Checks if dispatched unit type matches required unit types for incident type.
|
| 182 |
+
- Handles enum-qualified metadata keys safely.
|
| 183 |
+
|
| 184 |
+
3. survival
|
| 185 |
+
- Based on P1 incidents seen vs resolved without failure.
|
| 186 |
+
- Uses metadata lists: p1_seen, resolved_incidents, failed_incidents.
|
| 187 |
+
|
| 188 |
+
4. coverage
|
| 189 |
+
- Measures how many districts still have AVAILABLE coverage.
|
| 190 |
+
|
| 191 |
+
5. protocol
|
| 192 |
+
- If action invalid: 0.0.
|
| 193 |
+
- If valid and no phraseology text in Action.notes: neutral 0.5.
|
| 194 |
+
- If Action.notes provided: uses PhraseologyJudge score + readback correctness.
|
| 195 |
+
|
| 196 |
+
### 10.2 Episode score (whole run)
|
| 197 |
+
|
| 198 |
+
Episode score is task-specific via a central grade_episode router.
|
| 199 |
+
|
| 200 |
+
Why this matters:
|
| 201 |
+
- Different tasks need different definitions of success.
|
| 202 |
+
- Mean step reward alone is often too weak a signal to tell whether a task actually succeeded.
|
| 203 |
+
|
| 204 |
+
Task-specific episode graders:
|
| 205 |
+
|
| 206 |
+
1. single_incident
|
| 207 |
+
- +0.50 if incident resolved
|
| 208 |
+
- +0.30 if MEDIC dispatched correctly
|
| 209 |
+
- +0.20 if resolved within first 10 steps
|
| 210 |
+
|
| 211 |
+
2. multi_incident
|
| 212 |
+
- Uses P1 resolution, overall resolution ratio, and escalation penalty
|
| 213 |
+
- score = 0.5 * p1_score + 0.3 * resolution_score - 0.2 * failure_penalty
|
| 214 |
+
|
| 215 |
+
3. mass_casualty
|
| 216 |
+
- Emphasizes P1 survival with penalties for failures
|
| 217 |
+
- score = 0.6 * survival_score + 0.3 * mean_reward - failure_penalty
|
| 218 |
+
|
| 219 |
+
4. shift_surge (improved)
|
| 220 |
+
- Emphasizes long-horizon operational quality:
|
| 221 |
+
- incident throughput (resolved ratio)
|
| 222 |
+
- P1 survival
|
| 223 |
+
- coverage
|
| 224 |
+
- low backlog
|
| 225 |
+
- mean reward
|
| 226 |
+
- escalation penalty
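The first three graders have explicit formulas, so they can be sketched directly. Function and parameter names are hypothetical, "within first 10 steps" is read here as a step index of at most 10, and no clamping is applied because the text does not state one. shift_surge is omitted because its weights are not given.

```python
from typing import Optional


def grade_single_incident(resolved: bool,
                          medic_dispatched_correctly: bool,
                          resolved_at_step: Optional[int]) -> float:
    """Additive bonuses: resolution, correct MEDIC dispatch, fast resolution."""
    score = 0.0
    if resolved:
        score += 0.50
    if medic_dispatched_correctly:
        score += 0.30
    # Assumption: "within first 10 steps" means resolved_at_step <= 10.
    if resolved and resolved_at_step is not None and resolved_at_step <= 10:
        score += 0.20
    return score


def grade_multi_incident(p1_score: float,
                         resolution_score: float,
                         failure_penalty: float) -> float:
    """score = 0.5 * p1_score + 0.3 * resolution_score - 0.2 * failure_penalty"""
    return 0.5 * p1_score + 0.3 * resolution_score - 0.2 * failure_penalty


def grade_mass_casualty(survival_score: float,
                        mean_reward: float,
                        failure_penalty: float) -> float:
    """score = 0.6 * survival_score + 0.3 * mean_reward - failure_penalty"""
    return 0.6 * survival_score + 0.3 * mean_reward - failure_penalty
```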
|
| 227 |
+
|
| 228 |
+
## 11. Very important score semantics
|
| 229 |
+
|
| 230 |
+
In the OpenEnv wrapper:
|
| 231 |
+
- reward return value from step is per-step reward.
|
| 232 |
+
- observation.score is overwritten to episode score.
|
| 233 |
+
|
| 234 |
+
Also stored in metadata:
|
| 235 |
+
- cumulative_reward: running sum of step rewards.
|
| 236 |
+
- episode_rewards: list of per-step rewards.
|
| 237 |
+
- episode_score: current episode-level grade.
|
| 238 |
+
|
| 239 |
+
So if you compare values:
|
| 240 |
+
- reward = immediate local quality for this action
|
| 241 |
+
- observation.score = episode-level quality of the run so far
|
| 242 |
+
|
| 243 |
+
## 12. Is the dashboard connected to backend or just static?
|
| 244 |
+
|
| 245 |
+
It is connected to the backend.
|
| 246 |
+
|
| 247 |
+
How we know:
|
| 248 |
+
- The dashboard JavaScript calls the API endpoint http://localhost:8000/dashboard/state.
|
| 249 |
+
- It polls every 500 ms.
|
| 250 |
+
- It renders live units/incidents, step, and reward breakdown from backend response.
|
| 251 |
+
|
| 252 |
+
Connection behavior:
|
| 253 |
+
- If the backend is unreachable, the dashboard shows a disconnected status.
|
| 254 |
+
- If the backend is running and reset has been called, the dashboard updates live as the step changes.
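The dashboard itself is JavaScript, but the same polling pattern can be sketched in Python for a script that wants to watch the backend. The URL and the 500 ms interval come from the text above; the function names, the timeout, and the assumption that the endpoint returns a JSON object are illustrative.

```python
import json
import time
import urllib.request
from typing import Callable, List, Optional

DASHBOARD_URL = "http://localhost:8000/dashboard/state"
POLL_INTERVAL_S = 0.5  # 500 ms, matching the dashboard


def fetch_state(url: str = DASHBOARD_URL, timeout: float = 2.0) -> Optional[dict]:
    """Return the decoded JSON state, or None if the backend is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError):
        # OSError covers connection failures; ValueError covers malformed JSON.
        return None


def poll(fetch: Callable[[], Optional[dict]],
         iterations: int,
         sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Poll a fixed number of times and record the connection status each time."""
    statuses = []
    for _ in range(iterations):
        state = fetch()
        statuses.append("connected" if state is not None else "disconnected")
        sleep(POLL_INTERVAL_S)
    return statuses
```

Calling `poll(fetch_state, 10)` would sample the backend for about five seconds; passing a custom `fetch` keeps the loop testable without a running server.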
|
| 255 |
+
|
| 256 |
+
## 13. Why we used Docker
|
| 257 |
+
|
| 258 |
+
Docker is used to package the app and dependencies so it runs consistently everywhere.
|
| 259 |
+
|
| 260 |
+
Benefits:
|
| 261 |
+
- Same runtime on your machine, CI, and deployment platforms.
|
| 262 |
+
- No "works on my machine" package mismatch issues.
|
| 263 |
+
- Easy deployment with a single container image.
|
| 264 |
+
- Port compatibility: server reads PORT environment variable (important for hosted platforms).
|
| 265 |
+
|
| 266 |
+
In this project:
|
| 267 |
+
- Root Dockerfile runs uvicorn on 0.0.0.0 and PORT (default 8000).
|
| 268 |
+
- That makes it suitable for both local runs and hosted environments.
|
| 269 |
+
|
| 270 |
+
## 14. What API key are we using?
|
| 271 |
+
|
| 272 |
+
The project expects environment variables. Keys are not hardcoded in repository files.
|
| 273 |
+
|
| 274 |
+
Required for LLM mode:
|
| 275 |
+
- API_BASE_URL
|
| 276 |
+
- MODEL_NAME
|
| 277 |
+
- OPENAI_API_KEY
|
| 278 |
+
|
| 279 |
+
Compatibility fallback:
|
| 280 |
+
- HF_TOKEN is accepted if OPENAI_API_KEY is not set.
|
| 281 |
+
|
| 282 |
+
No-key mode:
|
| 283 |
+
- USE_RANDOM=true bypasses the LLM and uses a deterministic random baseline agent.
|
| 284 |
+
|
| 285 |
+
Practical meaning:
|
| 286 |
+
- If USE_RANDOM=true, you can run without any API key.
|
| 287 |
+
- If USE_RANDOM is not true, OPENAI_API_KEY (or HF_TOKEN fallback) is needed.
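The lookup order described above can be sketched as a small helper. The environment variable names come from this section; the function name, the case-insensitive comparison of `USE_RANDOM`, and the error raised when no key is found are assumptions about how such a check might look.

```python
from typing import Mapping, Optional


def resolve_api_key(env: Mapping[str, str]) -> Optional[str]:
    """Return the API key to use, or None when the random baseline is selected."""
    # No-key mode: the random baseline needs no credentials.
    if env.get("USE_RANDOM", "").strip().lower() == "true":
        return None
    # LLM mode: prefer OPENAI_API_KEY, fall back to HF_TOKEN.
    key = env.get("OPENAI_API_KEY") or env.get("HF_TOKEN")
    if not key:
        raise RuntimeError("Set OPENAI_API_KEY (or HF_TOKEN), or set USE_RANDOM=true")
    return key
```

In a real script the mapping would typically be `os.environ`; taking it as a parameter keeps the helper easy to test.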
|
| 288 |
+
|
| 289 |
+
## 15. Backend API endpoints (what each does)
|
| 290 |
+
|
| 291 |
+
- GET /health
|
| 292 |
+
- health check
|
| 293 |
+
|
| 294 |
+
- GET /tasks
|
| 295 |
+
- list available tasks
|
| 296 |
+
|
| 297 |
+
- POST /reset
|
| 298 |
+
- start new episode for selected task
|
| 299 |
+
|
| 300 |
+
- POST /step
|
| 301 |
+
- apply one action and move simulation one step
|
| 302 |
+
|
| 303 |
+
- GET /state
|
| 304 |
+
- current state
|
| 305 |
+
|
| 306 |
+
- GET /dashboard/state
|
| 307 |
+
- extended state for HTML dashboard (includes legal actions + last observation)
|
| 308 |
+
|
| 309 |
+
- GET /metadata and GET /schema
|
| 310 |
+
- environment metadata and contracts
|
| 311 |
+
|
| 312 |
+
- POST /mcp
|
| 313 |
+
- minimal JSON-RPC endpoint
|
| 314 |
+
|
| 315 |
+
## 16. What the dashboard shows vs what it does not show
|
| 316 |
+
|
| 317 |
+
Shows:
|
| 318 |
+
- Unit cards (status, assignment, ETA, location)
|
| 319 |
+
- Incident cards (type, severity, status, assigned units)
|
| 320 |
+
- Map view for units/incidents
|
| 321 |
+
- Last step reward component bars
|
| 322 |
+
- Header task/episode/step values
|
| 323 |
+
|
| 324 |
+
Nuance:
|
| 325 |
+
- Header "Score" currently uses metadata.cumulative_reward.
|
| 326 |
+
- Episode score is available too (metadata.episode_score), but not currently shown as the main header score.
|
| 327 |
+
|
| 328 |
+
## 17. Beginner glossary
|
| 329 |
+
|
| 330 |
+
- incident: emergency case to be handled
|
| 331 |
+
- unit: responder vehicle/team (EMS, fire, police, etc.)
|
| 332 |
+
- legal action: an action that passes protocol checks in current state
|
| 333 |
+
- reward: immediate feedback signal for one step
|
| 334 |
+
- episode score: overall quality of a full run
|
| 335 |
+
- terminal: episode is finished
|
| 336 |
+
|
| 337 |
+
## 18. Practical "how to think" summary
|
| 338 |
+
|
| 339 |
+
When you judge behavior quality in this project:
|
| 340 |
+
- Use step rewards to understand local tactical quality.
|
| 341 |
+
- Use episode score to understand mission success for the selected task.
|
| 342 |
+
- Use dashboard to observe live state transitions.
|
| 343 |
+
- Use task definitions to interpret what success means in each scenario.
|
| 344 |
+
|
| 345 |
+
If you remember one thing:
|
| 346 |
+
- This is not a generic chatbot app. It is a decision simulator where actions change a world state over time and are graded both step-by-step and across full episodes.
|
README.md
CHANGED
|
@@ -1,35 +1,31 @@
|
|
| 1 |
---
|
| 2 |
title: 911 Dispatch Supervisor
|
| 3 |
-
emoji: 🚨
|
| 4 |
colorFrom: red
|
| 5 |
colorTo: orange
|
| 6 |
sdk: docker
|
| 7 |
pinned: false
|
| 8 |
tags:
|
| 9 |
-
|
| 10 |
- openenv
|
| 11 |
- reinforcement-learning
|
| 12 |
- llm-agent
|
| 13 |
- emergency-dispatch
|
| 14 |
---
|
| 15 |
|
| 16 |
-
# 911
|
| 17 |
-
|
| 18 |
-
**LLM-powered 911 dispatch supervision — city scale**
|
| 19 |
|
| 20 |
-
|
| 21 |
|
| 22 |
## Overview
|
| 23 |
|
| 24 |
-
This
|
| 25 |
|
| 26 |
-
- **Dispatch lifecycle**: incidents
|
| 27 |
-
- **Deterministic simulation**:
|
| 28 |
-
- **Protocol validator**:
|
| 29 |
-
- **OpenEnv
|
| 30 |
-
- **
|
| 31 |
|
| 32 |
-
## Visualizer
|
| 33 |
|
| 34 |
The 2D visualizer is in `src/visualizer/viewer.py` and renders the current state to a PNG.
|
| 35 |
|
|
@@ -41,10 +37,10 @@ from src.openenv_environment import OpenEnvEnvironment
|
|
| 41 |
from src.visualizer.viewer import Viewer2D
|
| 42 |
|
| 43 |
async def main():
|
| 44 |
-
|
| 45 |
-
|
| 46 |
-
|
| 47 |
-
|
| 48 |
|
| 49 |
asyncio.run(main())
|
| 50 |
```
|
|
@@ -194,8 +190,7 @@ The reward signal is a weighted combination of five components:
|
|
| 194 |
| `coverage` | 12% | Geographic distribution of available units across city districts |
|
| 195 |
| `protocol` | 8% | Action legality + dispatch phraseology/readback quality (via `Action.notes`) |
|
| 196 |
|
| 197 |
-
|
| 198 |
-
|
| 199 |
|
| 200 |
|
| 201 |
## Project Structure
|
|
@@ -265,13 +260,11 @@ curl -X POST http://localhost:8000/reset -H "Content-Type: application/json" -d
|
|
| 265 |
| `/dashboard/state` | GET | Extended state for `live_dashboard.html` |
|
| 266 |
| `/tasks` | GET | List all available tasks with metadata |
|
| 267 |
|
| 268 |
-
##
|
| 269 |
-
|
| 270 |
-
### Deploying to Hugging Face Spaces (Docker)
|
| 271 |
|
| 272 |
-
|
| 273 |
|
| 274 |
-
1) Create a new Space
|
| 275 |
2) Push this repository to the Space.
|
| 276 |
3) The server binds to the `PORT` environment variable (HF commonly sets `PORT=7860`).
|
| 277 |
|
|
|
|
| 1 |
---
|
| 2 |
title: 911 Dispatch Supervisor
|
|
|
|
| 3 |
colorFrom: red
|
| 4 |
colorTo: orange
|
| 5 |
sdk: docker
|
| 6 |
pinned: false
|
| 7 |
tags:
|
|
|
|
| 8 |
- openenv
|
| 9 |
- reinforcement-learning
|
| 10 |
- llm-agent
|
| 11 |
- emergency-dispatch
|
| 12 |
---
|
| 13 |
|
| 14 |
+
# 911 Dispatch Supervisor
|
|
|
|
|
|
|
| 15 |
|
| 16 |
+
Deterministic simulator + RL-style environment for city-wide 911 dispatch. It supports police/fire/EMS unit allocation across concurrent incidents, with an OpenEnv-compatible interface and a small FastAPI server for interactive runs and the live dashboard.
|
| 17 |
|
| 18 |
## Overview
|
| 19 |
|
| 20 |
+
This repo is meant for training and evaluating agents (LLM-based or scripted baselines) as dispatch supervisors. It includes:
|
| 21 |
|
| 22 |
+
- **Dispatch lifecycle**: incidents progress from pending to resolved (or escalated)
|
| 23 |
+
- **Deterministic simulation**: reproducible episodes under fixed seeds
|
| 24 |
+
- **Protocol validator**: checks whether an action is legal in the current state
|
| 25 |
+
- **OpenEnv-compatible**: standard `reset` / `step` loop
|
| 26 |
+
- **2D visualization**: render a PNG snapshot of the current state
|
| 27 |
|
| 28 |
+
## Visualizer
|
| 29 |
|
| 30 |
The 2D visualizer is in `src/visualizer/viewer.py` and renders the current state to a PNG.
|
| 31 |
|
|
|
|
| 37 |
from src.visualizer.viewer import Viewer2D
|
| 38 |
|
| 39 |
async def main():
|
| 40 |
+
env = OpenEnvEnvironment(task_id="multi_incident", seed=42)
|
| 41 |
+
await env.reset()
|
| 42 |
+
Viewer2D().render_to_file("frame.png", env.state())
|
| 43 |
+
env.close()
|
| 44 |
|
| 45 |
asyncio.run(main())
|
| 46 |
```
|
|
|
|
| 190 |
| `coverage` | 12% | Geographic distribution of available units across city districts |
|
| 191 |
| `protocol` | 8% | Action legality + dispatch phraseology/readback quality (via `Action.notes`) |
|
| 192 |
|
| 193 |
+
Safety gate: if any Priority-1 incident was seen and `survival=0.0`, the total episode score is capped at `0.2` regardless of other components.
|
|
|
|
| 194 |
|
| 195 |
|
| 196 |
## Project Structure
|
|
|
|
| 260 |
| `/dashboard/state` | GET | Extended state for `live_dashboard.html` |
|
| 261 |
| `/tasks` | GET | List all available tasks with metadata |
|
| 262 |
|
| 263 |
+
## Hugging Face Spaces
|
|
|
|
|
|
|
| 264 |
|
| 265 |
+
### Deploying to Spaces (Docker)
|
| 266 |
|
| 267 |
+
1) Create a new Space and choose **Docker**.
|
| 268 |
2) Push this repository to the Space.
|
| 269 |
3) The server binds to the `PORT` environment variable (HF commonly sets `PORT=7860`).
|
| 270 |
|