--- title: DispatchPulse emoji: 🚑 colorFrom: red colorTo: blue sdk: docker app_port: 8000 pinned: false license: apache-2.0 --- # DispatchPulse **An OpenEnv environment where an AI agent acts as a 911 emergency dispatch coordinator.** The agent receives incoming calls, classifies their severity, and dispatches limited emergency units (ALS / BLS ambulances, fire engines, police) under time pressure. Patient outcomes are scored against **real clinical survival curves** — no LLM-as-judge, just defensible math. > Submission for the [Meta PyTorch OpenEnv Hackathon — India 2026](https://www.scaler.com/school-of-technology/meta-pytorch-hackathon). --- ## Why this environment In India, an estimated 24,000+ people die every day because of slow emergency response — average ambulance time is 25–35 minutes, well beyond the golden hour, and only ~20% of ambulances carry advanced life support. DispatchPulse simulates this crisis as an interactive RL environment where the agent has to learn the *counter-intuitive* strategies real dispatchers use: - **The greedy "closest unit" strategy fails.** Dispatching the only ALS to a sprained ankle leaves nothing for the cardiac arrest that arrives 3 minutes later — survival drops from 70% to 15%. - **Triage matters more than speed.** A weighted reward (severity 1 calls count 3× more than severity 4) means the agent has to *prioritise*, not just react. - **Hospital choice matters.** Sending a stroke patient to a hospital without a stroke unit, or to one on diversion, costs you score. The reward function uses real clinical survival curves from the EMS literature (Larsen et al. 1993 for cardiac arrest; Saver 2006 "Time is Brain" for stroke; golden hour curves for trauma). It's deterministic, defensible, and gives a continuous signal an RL agent can actually learn from. --- ## OpenEnv compliance | Requirement | Status | |---|---| | Real-world task (not games or toys) | ✅ Emergency dispatch — actual profession | | Typed Pydantic models inheriting from OpenEnv `Action` / `Observation` / `State` | ✅ `models.py` | | `Environment` base-class subclass with `reset()` / `step()` / `state` | ✅ `server/environment.py` | | FastAPI server via `create_fastapi_app(...)` | ✅ `server/app.py` | | `EnvClient` client with `_step_payload` / `_parse_result` / `_parse_state` | ✅ `client.py` | | `openenv.yaml` manifest | ✅ | | ≥ 3 tasks with graders, scores 0.0–1.0 | ✅ easy / medium / hard | | Meaningful reward + partial progress | ✅ survival curves + per-step rewards | | `inference.py` at root, OpenAI client, mandatory env vars, `[START]/[STEP]/[END]` format | ✅ | | Reproducible (fixed seed) | ✅ `seed=42` default everywhere | | Pre-submission validator script | ✅ `scripts/validate-submission.sh` | | Dockerfile + HF Spaces deploy | ✅ uses `openenv-base` | | Runs on 2 vCPU / 8 GB RAM | ✅ pure Python math, no ML inference | --- ## Project layout (canonical OpenEnv structure) ``` DispatchPulse/ ├── README.md ├── Dockerfile # uses ghcr.io/meta-pytorch/openenv-base ├── openenv.yaml # OpenEnv manifest ├── pyproject.toml ├── inference.py # ROUND 1 ENTRY POINT — must be in root ├── client.py # DispatchPulseEnv (subclass of EnvClient) ├── models.py # DispatchPulseAction / Observation / State │ # plus internal sim models ├── simulation.py # DispatchSimulation engine ├── reward.py # Survival curves + episode reward ├── grader.py # Programmatic 0.0–1.0 grader ├── scenario_loader.py # YAML task loader ├── text_view.py # LLM-friendly dispatch center renderer ├── utils.py # Distance / ETA / templates ├── server/ │ ├── __init__.py │ ├── app.py # FastAPI app via create_fastapi_app(...) │ └── environment.py # DispatchPulseEnvironment(Environment) ├── tasks/ │ ├── easy.yaml │ ├── medium.yaml │ └── hard.yaml ├── scripts/ │ └── validate-submission.sh # runs the 3 grader checks locally └── tests/ ├── test_reward.py └── test_simulation.py ``` --- ## Action space (typed Pydantic) `DispatchPulseAction` has these `action_type` values: | `action_type` | Required fields | Time cost | What it does | |---|---|---|---| | `dispatch` | `call_id`, `unit_id`, `hospital_id?` | 1 min | Send a unit to a call (optionally pre-routing to a hospital). | | `classify` | `call_id`, `severity` (1-5) | 1 min | Reclassify a call's severity. | | `callback` | `call_id`, `message` | 1 min | Phone the caller back. 70% chance they clarify the true emergency type. | | `wait` | `minutes` (default 1, max 5) | n min | Skip ahead in the simulation when there's nothing to do. | | `view` | — | free | Re-fetch the dispatch center text without advancing time. | The action also has a free-text `text` field — the server parses lines like `dispatch CALL-001 ALS-1 H1` so an LLM can produce them directly. ## Observation space `DispatchPulseObservation` has: - `text` — formatted dispatch center view (the field the LLM reads) - `current_time`, `time_limit` - `calls_pending`, `units_available`, `calls_completed`, `calls_timed_out`, `total_calls` - `last_action_error` — error string from the previous action, or `None` - `info_message` — what just happened - inherited `done`, `reward`, `metadata` ## Tasks | Task | Calls | Units | Hospitals | Duration | Caller misreporting | What's hard about it | |---|---|---|---|---|---|---| | `easy` | 5 | 4 | 1 | 30 min | 0% | Basic dispatch — learn the action grammar | | `medium` | 15 | 6 | 2 | 45 min | 20% | Mass casualty bus accident at minute 12; some callers lie | | `hard` | 30 | 8 | 3 (1 on diversion) | 60 min | 35% | Earthquake — extreme scarcity, panicked callers, hospital triage matters | All three are deterministic given the seed. --- ## Reward function Final episode score = weighted combination of four components, all in [0, 1]: | Component | Weight | What it measures | |---|---|---| | `survival_score` | 0.60 | Severity-weighted average outcome across all calls (uses clinical survival curves × unit effectiveness × hospital modifier) | | `efficiency_score` | 0.15 | Fraction of calls dispatched, penalised for wasting ALS on minor calls | | `triage_accuracy` | 0.15 | Fraction of severity-1 calls dispatched within 25% of their timeout window | | `penalty` | −0.10 | Deductions for timed-out criticals and wrong-unit assignments | Severity weights inside the survival score: **3× for severity 1, 2× for 2, 1.5× for 3, 1× for 4, 0.5× for 5**. ### Survival curves (from EMS literature) | Emergency | Curve | Source / notes | |---|---|---| | Cardiac arrest | exponential, ~10%/min decay | Larsen et al. 1993 | | Trauma | sigmoid centred at 45 min | "golden hour" | | Stroke | exponential decay | Saver 2006 — every minute = 1.9M neurons | | Fire | exponential, doubles per minute | property loss | | Breathing difficulty | gentler exponential | | | Minor injury | nearly flat | stable patient | | Mental health | gentler exponential | de-escalation success | Each call's outcome is multiplied by: - **Unit effectiveness** (e.g., ALS → cardiac = 1.0; BLS → cardiac = 0.5; fire engine → cardiac = 0.1) - **Hospital modifier** (specialty match: +5%; on diversion or zero beds: −15%) --- ## Baseline scores (heuristic agent, seed=42) A simple rule-based heuristic (always pick the most-critical call, send the most effective available unit, reserve ALS for high-severity calls) produces the following calibrated scores: | Task | Total | Survival | Efficiency | Triage | Penalty | Completed/Total | |---|---|---|---|---|---|---| | easy | 0.5476 | 0.463 | 0.800 | 1.000 | −0.000 | 4/5 | | medium | 0.3750 | 0.377 | 0.600 | 0.500 | −0.160 | 9/15 | | hard | 0.2183 | 0.214 | 0.433 | 0.500 | −0.500 | 13/30 | | **Average** | **0.3803** | | | | | | The clean monotonic decrease across difficulty (easy > medium > hard) confirms the env discriminates between scenarios as designed. --- ## Inference script — `inference.py` Per the hackathon spec, `inference.py` is in the **project root** and follows the mandatory contract: ### Required environment variables | Variable | Purpose | Default in script | |---|---|---| | `API_BASE_URL` | LLM endpoint | `https://router.huggingface.co/v1` | | `MODEL_NAME` | Which model to call | `Qwen/Qwen2.5-72B-Instruct` | | `HF_TOKEN` | API key for the LLM | (no default) | | `LOCAL_IMAGE_NAME` | Docker image for `from_docker_image()` | (no default) | | `DISPATCHPULSE_TASK` | Which task to run (`easy`/`medium`/`hard`) | `easy` | ### Stdout format (verbatim) ``` [START] task= env=dispatchpulse model= [STEP] step= action= reward=<0.00> done= error= [END] success= steps= score= rewards= ``` - One `[START]` line at episode begin - One `[STEP]` line per step, immediately after `env.step()` returns - One `[END]` line after `env.close()`, ALWAYS emitted (even on exception) - `reward` and `rewards` to 2 decimal places; `score` to 3 decimal places - `done` and `success` are lowercase booleans ### Connection logic 1. If `LOCAL_IMAGE_NAME` is set → `await DispatchPulseEnv.from_docker_image(LOCAL_IMAGE_NAME)` 2. Else if `ENV_BASE_URL` is set → connect directly to a running env server 3. Otherwise → spin up an in-process simulation as a fallback (for offline runs) ### Run it ```bash # Against the live HF Space ENV_BASE_URL=https://arun-sanjay-dispatchpulse.hf.space \ HF_TOKEN=$HF_TOKEN \ python inference.py # Against a local Docker image LOCAL_IMAGE_NAME=dispatchpulse:latest \ HF_TOKEN=$HF_TOKEN \ python inference.py # In-process fallback (no network, no Docker) python inference.py ``` --- ## Setup ### Run locally with Python ```bash python -m venv .venv && source .venv/bin/activate pip install -e . python inference.py ``` ### Run locally with Docker ```bash docker build -t dispatchpulse . docker run -p 8000:8000 dispatchpulse # Then in another shell: curl http://localhost:8000/health ``` ### Use as a client (OpenEnv `EnvClient` pattern) ```python import asyncio from client import DispatchPulseEnv from models import DispatchPulseAction async def main(): async with DispatchPulseEnv(base_url="https://arun-sanjay-dispatchpulse.hf.space") as env: result = await env.reset(task_name="easy", seed=42) while not result.done: action = DispatchPulseAction(action_type="wait", minutes=1, text="wait 1") result = await env.step(action) print(result.observation.text[:200]) print(f"Final score: {result.reward}") asyncio.run(main()) ``` ### Run on Hugging Face Spaces Auto-built as a Docker Space: [`https://huggingface.co/spaces/Arun-Sanjay/dispatchpulse`](https://huggingface.co/spaces/Arun-Sanjay/dispatchpulse) --- ## Pre-submission validator Run the same three checks the hackathon's automated grader runs: ```bash ./scripts/validate-submission.sh https://arun-sanjay-dispatchpulse.hf.space . ``` It checks: 1. **HF Space deploys** — `POST /reset` returns HTTP 200 2. **Docker build** — `docker build .` succeeds (≤ 10 min) 3. **OpenEnv compliance** — `openenv validate` passes --- ## Calibration tests The reward function ships with calibration tests that double as documentation: ```bash python tests/test_reward.py python tests/test_simulation.py ``` These verify that: - Survival curves match published clinical numbers - A "do-nothing" agent scores below 0.15 on every task - A simple heuristic strictly outperforms the silent agent - Heuristic scores monotonically decrease easy → medium → hard - ALS at cardiac arrest beats fire engine at cardiac arrest by ≥5× - Specialty hospital match boosts outcome; diversion hurts it --- ## License Apache 2.0. Built for the Meta PyTorch OpenEnv Hackathon — India 2026.