---
title: DispatchPulse
emoji: π
colorFrom: red
colorTo: blue
sdk: docker
app_port: 8000
pinned: false
license: apache-2.0
---
# DispatchPulse
**An OpenEnv environment where an AI agent acts as a 911 emergency dispatch coordinator.**
The agent receives incoming calls, classifies their severity, and dispatches limited
emergency units (ALS / BLS ambulances, fire engines, police) under time pressure.
Patient outcomes are scored against **real clinical survival curves**: no
LLM-as-judge, just defensible math.
> Submission for the [Meta PyTorch OpenEnv Hackathon - India 2026](https://www.scaler.com/school-of-technology/meta-pytorch-hackathon).
---
## Why this environment
In India, an estimated 24,000+ people die every day because of slow emergency
response: the average ambulance response time is 25-35 minutes, which uses up
much of the golden hour, and only ~20% of ambulances carry advanced life
support. DispatchPulse simulates this crisis as an interactive RL environment
where the agent has to learn the *counter-intuitive* strategies real dispatchers use:
- **The greedy "closest unit" strategy fails.** Dispatching the only ALS to a
  sprained ankle leaves nothing for the cardiac arrest that arrives 3 minutes
  later, and survival drops from 70% to 15%.
- **Triage matters more than speed.** A weighted reward (severity 1 calls
  count 3× as much as severity 4) means the agent has to *prioritise*, not
  just react.
- **Hospital choice matters.** Sending a stroke patient to a hospital without
  a stroke unit, or to one on diversion, costs you score.
The reward function uses real clinical survival curves from the EMS literature
(Larsen et al. 1993 for cardiac arrest; Saver 2006 "Time is Brain" for stroke;
golden hour curves for trauma). It's deterministic, defensible, and gives a
continuous signal an RL agent can actually learn from.
---
## OpenEnv compliance
| Requirement | Status |
|---|---|
| Real-world task (not games or toys) | ✅ Emergency dispatch, an actual profession |
| Typed Pydantic models inheriting from OpenEnv `Action` / `Observation` / `State` | ✅ `models.py` |
| `Environment` base-class subclass with `reset()` / `step()` / `state` | ✅ `server/environment.py` |
| FastAPI server via `create_fastapi_app(...)` | ✅ `server/app.py` |
| `EnvClient` client with `_step_payload` / `_parse_result` / `_parse_state` | ✅ `client.py` |
| `openenv.yaml` manifest | ✅ |
| ≥ 3 tasks with graders, scores 0.0-1.0 | ✅ easy / medium / hard |
| Meaningful reward + partial progress | ✅ survival curves + per-step rewards |
| `inference.py` at root, OpenAI client, mandatory env vars, `[START]/[STEP]/[END]` format | ✅ |
| Reproducible (fixed seed) | ✅ `seed=42` default everywhere |
| Pre-submission validator script | ✅ `scripts/validate-submission.sh` |
| Dockerfile + HF Spaces deploy | ✅ uses `openenv-base` |
| Runs on 2 vCPU / 8 GB RAM | ✅ pure Python math, no ML inference |
---
## Project layout (canonical OpenEnv structure)
```
DispatchPulse/
├── README.md
├── Dockerfile                # uses ghcr.io/meta-pytorch/openenv-base
├── openenv.yaml              # OpenEnv manifest
├── pyproject.toml
├── inference.py              # ROUND 1 ENTRY POINT: must be in root
├── client.py                 # DispatchPulseEnv (subclass of EnvClient)
├── models.py                 # DispatchPulseAction / Observation / State
│                             # plus internal sim models
├── simulation.py             # DispatchSimulation engine
├── reward.py                 # Survival curves + episode reward
├── grader.py                 # Programmatic 0.0-1.0 grader
├── scenario_loader.py        # YAML task loader
├── text_view.py              # LLM-friendly dispatch center renderer
├── utils.py                  # Distance / ETA / templates
├── server/
│   ├── __init__.py
│   ├── app.py                # FastAPI app via create_fastapi_app(...)
│   └── environment.py        # DispatchPulseEnvironment(Environment)
├── tasks/
│   ├── easy.yaml
│   ├── medium.yaml
│   └── hard.yaml
├── scripts/
│   └── validate-submission.sh  # runs the 3 grader checks locally
└── tests/
    ├── test_reward.py
    └── test_simulation.py
```
---
## Action space (typed Pydantic)
`DispatchPulseAction` has these `action_type` values:
| `action_type` | Required fields | Time cost | What it does |
|---|---|---|---|
| `dispatch` | `call_id`, `unit_id`, `hospital_id?` | 1 min | Send a unit to a call (optionally pre-routing to a hospital). |
| `classify` | `call_id`, `severity` (1-5) | 1 min | Reclassify a call's severity. |
| `callback` | `call_id`, `message` | 1 min | Phone the caller back. 70% chance they clarify the true emergency type. |
| `wait` | `minutes` (default 1, max 5) | n min | Skip ahead in the simulation when there's nothing to do. |
| `view` | (none) | free | Re-fetch the dispatch center text without advancing time. |

The action also has a free-text `text` field. The server parses lines like
`dispatch CALL-001 ALS-1 H1`, so an LLM can produce them directly.
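
To make the text grammar concrete, here is a minimal sketch of how such a line could be turned into action fields. Only the `dispatch CALL-001 ALS-1 H1` form and `wait 1` appear in this README; the argument order for `classify` and `callback`, the function name `parse_action_text`, and the returned dict shape are assumptions for illustration, not the server's actual parser.

```python
ACTION_TYPES = {"dispatch", "classify", "callback", "wait", "view"}


def parse_action_text(line: str) -> dict:
    """Parse one command line into a dict of action fields (hypothetical helper)."""
    parts = line.strip().split()
    if not parts or parts[0].lower() not in ACTION_TYPES:
        raise ValueError(f"unknown action: {line!r}")
    kind, args = parts[0].lower(), parts[1:]

    if kind == "dispatch":
        # call_id and unit_id are required; hospital_id is optional.
        if len(args) not in (2, 3):
            raise ValueError("dispatch needs call_id, unit_id and an optional hospital_id")
        hospital_id = args[2] if len(args) == 3 else None
        return {"action_type": kind, "call_id": args[0],
                "unit_id": args[1], "hospital_id": hospital_id}

    if kind == "classify":
        # Assumed order: classify <call_id> <severity>, severity in 1..5.
        if len(args) != 2:
            raise ValueError("classify needs call_id and severity")
        severity = int(args[1])
        if not 1 <= severity <= 5:
            raise ValueError("severity must be between 1 and 5")
        return {"action_type": kind, "call_id": args[0], "severity": severity}

    if kind == "callback":
        # Assumed order: callback <call_id> <free-text message>.
        if not args:
            raise ValueError("callback needs call_id")
        return {"action_type": kind, "call_id": args[0], "message": " ".join(args[1:])}

    if kind == "wait":
        # minutes defaults to 1 and is capped at 5, as in the table above.
        minutes = int(args[0]) if args else 1
        if not 1 <= minutes <= 5:
            raise ValueError("minutes must be between 1 and 5")
        return {"action_type": kind, "minutes": minutes}

    return {"action_type": "view"}
```

With this sketch, `parse_action_text("dispatch CALL-001 ALS-1 H1")` yields a dict whose `call_id` is `"CALL-001"`, `unit_id` is `"ALS-1"`, and `hospital_id` is `"H1"`.
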
## Observation space
`DispatchPulseObservation` has:
- `text`: formatted dispatch center view (the field the LLM reads)
- `current_time`, `time_limit`
- `calls_pending`, `units_available`, `calls_completed`, `calls_timed_out`, `total_calls`
- `last_action_error`: error string from the previous action, or `None`
- `info_message`: what just happened
- inherited `done`, `reward`, `metadata`
## Tasks
| Task | Calls | Units | Hospitals | Duration | Caller misreporting | What's hard about it |
|---|---|---|---|---|---|---|
| `easy` | 5 | 4 | 1 | 30 min | 0% | Basic dispatch: learn the action grammar |
| `medium` | 15 | 6 | 2 | 45 min | 20% | Mass casualty bus accident at minute 12; some callers lie |
| `hard` | 30 | 8 | 3 (1 on diversion) | 60 min | 35% | Earthquake: extreme scarcity, panicked callers, hospital triage matters |
All three are deterministic given the seed.
---
## Reward function
Final episode score = weighted combination of four components, all in [0, 1]:
| Component | Weight | What it measures |
|---|---|---|
| `survival_score` | 0.60 | Severity-weighted average outcome across all calls (clinical survival curve × unit effectiveness × hospital modifier) |
| `efficiency_score` | 0.15 | Fraction of calls dispatched, penalised for wasting ALS on minor calls |
| `triage_accuracy` | 0.15 | Fraction of severity-1 calls dispatched within the first 25% of their timeout window |
| `penalty` | -0.10 | Deductions for timed-out criticals and wrong-unit assignments |

Severity weights inside the survival score: **3× for severity 1, 2× for 2, 1.5× for 3, 1× for 4, 0.5× for 5**.
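
The weighting above can be sketched in a few lines. This is an illustration under stated assumptions, not the contents of `reward.py`: the function names are hypothetical, the penalty is treated as a non-negative magnitude that is subtracted with weight 0.10, and the final clamp to [0, 1] is an assumption.

```python
# Severity weights and component weights as listed in this section.
SEVERITY_WEIGHTS = {1: 3.0, 2: 2.0, 3: 1.5, 4: 1.0, 5: 0.5}


def survival_score(outcomes):
    """Severity-weighted mean of per-call outcomes.

    `outcomes` is a list of (severity, outcome) pairs with outcome in [0, 1].
    """
    total_weight = sum(SEVERITY_WEIGHTS[sev] for sev, _ in outcomes)
    if total_weight == 0:
        return 0.0
    weighted = sum(SEVERITY_WEIGHTS[sev] * out for sev, out in outcomes)
    return weighted / total_weight


def episode_score(survival, efficiency, triage, penalty):
    """Combine the four components; `penalty` is a magnitude in [0, 1]."""
    raw = 0.60 * survival + 0.15 * efficiency + 0.15 * triage - 0.10 * penalty
    # Clamping is an assumption so the result stays a valid 0.0-1.0 score.
    return max(0.0, min(1.0, raw))
```

As a sanity check, plugging the `medium` baseline components from this README (survival 0.377, efficiency 0.600, triage 0.500, penalty magnitude 0.160) into `episode_score` gives about 0.375, which matches the reported total up to rounding of the displayed components.
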
### Survival curves (from EMS literature)
| Emergency | Curve | Source / notes |
|---|---|---|
| Cardiac arrest | exponential, ~10%/min decay | Larsen et al. 1993 |
| Trauma | sigmoid centred at 45 min | "golden hour" |
| Stroke | exponential decay | Saver 2006: every minute of delay costs about 1.9M neurons |
| Fire | exponential, doubles per minute | property loss |
| Breathing difficulty | gentler exponential | |
| Minor injury | nearly flat | stable patient |
| Mental health | gentler exponential | de-escalation success |
Each call's outcome is multiplied by:
- **Unit effectiveness** (e.g., ALS → cardiac = 1.0; BLS → cardiac = 0.5; fire engine → cardiac = 0.1)
- **Hospital modifier** (specialty match: +5%; on diversion or zero beds: -15%)
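
A rough sketch of how a curve, a unit effectiveness, and a hospital modifier might multiply together follows. The curve shapes come from the table above, but every constant not stated there is an assumption: the starting survival of 1.0, the sigmoid steepness of 10 minutes, applying the hospital percentages multiplicatively, and capping the result at 1.0. The real `reward.py` may differ on all of these.

```python
import math


def cardiac_survival(minutes: float) -> float:
    """Exponential decay of roughly 10% per minute (starting value assumed 1.0)."""
    return 0.9 ** minutes


def trauma_survival(minutes: float, midpoint: float = 45.0, steepness: float = 10.0) -> float:
    """Sigmoid centred at `midpoint` minutes; `steepness` is an assumed parameter."""
    return 1.0 / (1.0 + math.exp((minutes - midpoint) / steepness))


# Effectiveness values quoted in this README for cardiac calls.
UNIT_EFFECTIVENESS = {
    ("ALS", "cardiac"): 1.0,
    ("BLS", "cardiac"): 0.5,
    ("FIRE", "cardiac"): 0.1,
}


def hospital_modifier(specialty_match: bool, diverted_or_full: bool) -> float:
    """+5% for a specialty match, -15% for diversion or zero beds (assumed multiplicative)."""
    if diverted_or_full:
        return 0.85
    return 1.05 if specialty_match else 1.0


def call_outcome(base_survival: float, unit_effectiveness: float, modifier: float) -> float:
    """Multiply the three factors and cap at 1.0 (the cap is an assumption)."""
    return min(1.0, base_survival * unit_effectiveness * modifier)
```

For a cardiac arrest reached after 5 minutes, `call_outcome(cardiac_survival(5), UNIT_EFFECTIVENESS[("ALS", "cardiac")], hospital_modifier(True, False))` is ten times the value obtained with the fire-engine effectiveness, since only that factor changes.
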
---
## Baseline scores (heuristic agent, seed=42)
A simple rule-based heuristic (always pick the most-critical call, send the
most effective available unit, reserve ALS for high-severity calls) produces
the following calibrated scores:
| Task | Total | Survival | Efficiency | Triage | Penalty | Completed/Total |
|---|---|---|---|---|---|---|
| easy | 0.5476 | 0.463 | 0.800 | 1.000 | -0.000 | 4/5 |
| medium | 0.3750 | 0.377 | 0.600 | 0.500 | -0.160 | 9/15 |
| hard | 0.2183 | 0.214 | 0.433 | 0.500 | -0.500 | 13/30 |
| **Average** | **0.3803** | | | | | |
The clean monotonic decrease across difficulty (easy > medium > hard) confirms
the env discriminates between scenarios as designed.
---
## Inference script: `inference.py`
Per the hackathon spec, `inference.py` is in the **project root** and follows
the mandatory contract:
### Required environment variables
| Variable | Purpose | Default in script |
|---|---|---|
| `API_BASE_URL` | LLM endpoint | `https://router.huggingface.co/v1` |
| `MODEL_NAME` | Which model to call | `Qwen/Qwen2.5-72B-Instruct` |
| `HF_TOKEN` | API key for the LLM | (no default) |
| `LOCAL_IMAGE_NAME` | Docker image for `from_docker_image()` | (no default) |
| `DISPATCHPULSE_TASK` | Which task to run (`easy`/`medium`/`hard`) | `easy` |
### Stdout format (verbatim)
```
[START] task=<task_name> env=dispatchpulse model=<model_name>
[STEP] step=<n> action=<action_str> reward=<0.00> done=<true|false> error=<msg|null>
[END] success=<true|false> steps=<n> score=<score> rewards=<r1,r2,...,rn>
```
- One `[START]` line at episode begin
- One `[STEP]` line per step, immediately after `env.step()` returns
- One `[END]` line after `env.close()`, ALWAYS emitted (even on exception)
- `reward` and `rewards` to 2 decimal places; `score` to 3 decimal places
- `done` and `success` are lowercase booleans
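
The formatting rules above can be captured in three small helpers. This is a sketch of one way to produce the lines, not the code in `inference.py`; the helper names are hypothetical.

```python
def format_start(task: str, model: str, env: str = "dispatchpulse") -> str:
    """Build the single [START] line for an episode."""
    return f"[START] task={task} env={env} model={model}"


def format_step(step: int, action: str, reward: float, done: bool, error=None) -> str:
    """Build one [STEP] line: reward to 2 decimals, lowercase boolean, null if no error."""
    error_text = error if error else "null"
    return (f"[STEP] step={step} action={action} reward={reward:.2f} "
            f"done={str(done).lower()} error={error_text}")


def format_end(success: bool, steps: int, score: float, rewards) -> str:
    """Build the [END] line: score to 3 decimals, rewards to 2 decimals each."""
    joined = ",".join(f"{r:.2f}" for r in rewards)
    return (f"[END] success={str(success).lower()} steps={steps} "
            f"score={score:.3f} rewards={joined}")
```

To honour the "always emitted" rule, one option is to call `format_end` from a `finally:` block that wraps the episode loop, so the `[END]` line is printed even when a step raises.
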
### Connection logic
1. If `LOCAL_IMAGE_NAME` is set → `await DispatchPulseEnv.from_docker_image(LOCAL_IMAGE_NAME)`
2. Else if `ENV_BASE_URL` is set → connect directly to a running env server
3. Otherwise → spin up an in-process simulation as a fallback (for offline runs)
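
The precedence in the three steps above can be expressed as a small pure function. This sketch only decides which mode to use; the function name `choose_connection` and the mode labels are hypothetical, and the actual connection calls are left to `inference.py`.

```python
import os
from typing import Mapping, Optional, Tuple


def choose_connection(env: Mapping[str, str]) -> Tuple[str, Optional[str]]:
    """Return (mode, target) following the documented precedence.

    Empty strings are treated the same as unset variables.
    """
    image = env.get("LOCAL_IMAGE_NAME")
    if image:
        return ("docker", image)       # step 1: start from a local Docker image
    base_url = env.get("ENV_BASE_URL")
    if base_url:
        return ("remote", base_url)    # step 2: connect to a running env server
    return ("in_process", None)        # step 3: offline in-process fallback


if __name__ == "__main__":
    mode, target = choose_connection(os.environ)
    print(f"connection mode: {mode} target: {target}")
```

Keeping the decision separate from the connection code makes the precedence easy to unit test without Docker or network access.
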
### Run it
```bash
# Against the live HF Space
ENV_BASE_URL=https://arun-sanjay-dispatchpulse.hf.space \
HF_TOKEN=$HF_TOKEN \
python inference.py
# Against a local Docker image
LOCAL_IMAGE_NAME=dispatchpulse:latest \
HF_TOKEN=$HF_TOKEN \
python inference.py
# In-process fallback (no network, no Docker)
python inference.py
```
---
## Setup
### Run locally with Python
```bash
python -m venv .venv && source .venv/bin/activate
pip install -e .
python inference.py
```
### Run locally with Docker
```bash
docker build -t dispatchpulse .
docker run -p 8000:8000 dispatchpulse
# Then in another shell:
curl http://localhost:8000/health
```
### Use as a client (OpenEnv `EnvClient` pattern)
```python
import asyncio

from client import DispatchPulseEnv
from models import DispatchPulseAction


async def main():
    async with DispatchPulseEnv(base_url="https://arun-sanjay-dispatchpulse.hf.space") as env:
        result = await env.reset(task_name="easy", seed=42)
        while not result.done:
            # Trivial policy: always wait one minute, then show the new view.
            action = DispatchPulseAction(action_type="wait", minutes=1, text="wait 1")
            result = await env.step(action)
            print(result.observation.text[:200])
        print(f"Final score: {result.reward}")


asyncio.run(main())
```
### Run on Hugging Face Spaces
Auto-built as a Docker Space:
[`https://huggingface.co/spaces/Arun-Sanjay/dispatchpulse`](https://huggingface.co/spaces/Arun-Sanjay/dispatchpulse)
---
## Pre-submission validator
Run the same three checks the hackathon's automated grader runs:
```bash
./scripts/validate-submission.sh https://arun-sanjay-dispatchpulse.hf.space .
```
It checks:
1. **HF Space deploys**: `POST /reset` returns HTTP 200
2. **Docker build**: `docker build .` succeeds (≤ 10 min)
3. **OpenEnv compliance**: `openenv validate` passes
---
## Calibration tests
The reward function ships with calibration tests that double as documentation:
```bash
python tests/test_reward.py
python tests/test_simulation.py
```
These verify that:
- Survival curves match published clinical numbers
- A "do-nothing" agent scores below 0.15 on every task
- A simple heuristic strictly outperforms the silent agent
- Heuristic scores monotonically decrease easy → medium → hard
- ALS at cardiac arrest beats fire engine at cardiac arrest by ≥5×
- Specialty hospital match boosts outcome; diversion hurts it
---
## License
Apache 2.0. Built for the Meta PyTorch OpenEnv Hackathon - India 2026.