| title: DispatchPulse | |
| emoji: π | |
| colorFrom: red | |
| colorTo: blue | |
| sdk: docker | |
| app_port: 8000 | |
| pinned: false | |
| license: apache-2.0 | |
| # DispatchPulse | |
| **An OpenEnv environment where an AI agent acts as a 911 emergency dispatch coordinator.** | |
| The agent receives incoming calls, classifies their severity, and dispatches limited | |
| emergency units (ALS / BLS ambulances, fire engines, police) under time pressure. | |
| Patient outcomes are scored against **real clinical survival curves**: no | |
| LLM-as-judge, just defensible math. | |
| > Submission for the [Meta PyTorch OpenEnv Hackathon - India 2026](https://www.scaler.com/school-of-technology/meta-pytorch-hackathon). | |
| --- | |
| ## Why this environment | |
| In India, an estimated 24,000+ people die every day because of slow emergency | |
| response: average ambulance time is 25-35 minutes, well beyond the golden hour, | |
| and only ~20% of ambulances carry advanced life support. DispatchPulse simulates | |
| this crisis as an interactive RL environment where the agent has to learn the | |
| *counter-intuitive* strategies real dispatchers use: | |
| - **The greedy "closest unit" strategy fails.** Dispatching the only ALS to a | |
| sprained ankle leaves nothing for the cardiac arrest that arrives 3 minutes | |
| later, and survival drops from 70% to 15%. | |
| - **Triage matters more than speed.** A weighted reward (severity 1 calls | |
| count 3× more than severity 4) means the agent has to *prioritise*, not | |
| just react. | |
| - **Hospital choice matters.** Sending a stroke patient to a hospital without | |
| a stroke unit, or to one on diversion, costs you score. | |
| The reward function uses real clinical survival curves from the EMS literature | |
| (Larsen et al. 1993 for cardiac arrest; Saver 2006 "Time is Brain" for stroke; | |
| golden hour curves for trauma). It's deterministic, defensible, and gives a | |
| continuous signal an RL agent can actually learn from. | |
| --- | |
| ## OpenEnv compliance | |
| | Requirement | Status | | |
| |---|---| | |
| | Real-world task (not games or toys) | ✅ Emergency dispatch, an actual profession | | |
| | Typed Pydantic models inheriting from OpenEnv `Action` / `Observation` / `State` | ✅ `models.py` | | |
| | `Environment` base-class subclass with `reset()` / `step()` / `state` | ✅ `server/environment.py` | | |
| | FastAPI server via `create_fastapi_app(...)` | ✅ `server/app.py` | | |
| | `EnvClient` client with `_step_payload` / `_parse_result` / `_parse_state` | ✅ `client.py` | | |
| | `openenv.yaml` manifest | ✅ | | |
| | ≥ 3 tasks with graders, scores 0.0-1.0 | ✅ easy / medium / hard | | |
| | Meaningful reward + partial progress | ✅ survival curves + per-step rewards | | |
| | `inference.py` at root, OpenAI client, mandatory env vars, `[START]/[STEP]/[END]` format | ✅ | | |
| | Reproducible (fixed seed) | ✅ `seed=42` default everywhere | | |
| | Pre-submission validator script | ✅ `scripts/validate-submission.sh` | | |
| | Dockerfile + HF Spaces deploy | ✅ uses `openenv-base` | | |
| | Runs on 2 vCPU / 8 GB RAM | ✅ pure Python math, no ML inference | | |
| --- | |
| ## Project layout (canonical OpenEnv structure) | |
| ``` | |
| DispatchPulse/ | |
| ├── README.md | |
| ├── Dockerfile # uses ghcr.io/meta-pytorch/openenv-base | |
| ├── openenv.yaml # OpenEnv manifest | |
| ├── pyproject.toml | |
| ├── inference.py # ROUND 1 ENTRY POINT, must be in root | |
| ├── client.py # DispatchPulseEnv (subclass of EnvClient) | |
| ├── models.py # DispatchPulseAction / Observation / State | |
| │ # plus internal sim models | |
| ├── simulation.py # DispatchSimulation engine | |
| ├── reward.py # Survival curves + episode reward | |
| ├── grader.py # Programmatic 0.0-1.0 grader | |
| ├── scenario_loader.py # YAML task loader | |
| ├── text_view.py # LLM-friendly dispatch center renderer | |
| ├── utils.py # Distance / ETA / templates | |
| ├── server/ | |
| │   ├── __init__.py | |
| │   ├── app.py # FastAPI app via create_fastapi_app(...) | |
| │   └── environment.py # DispatchPulseEnvironment(Environment) | |
| ├── tasks/ | |
| │   ├── easy.yaml | |
| │   ├── medium.yaml | |
| │   └── hard.yaml | |
| ├── scripts/ | |
| │   └── validate-submission.sh # runs the 3 grader checks locally | |
| └── tests/ | |
|     ├── test_reward.py | |
|     └── test_simulation.py | |
| ``` | |
| --- | |
| ## Action space (typed Pydantic) | |
| `DispatchPulseAction` has these `action_type` values: | |
| | `action_type` | Required fields | Time cost | What it does | | |
| |---|---|---|---| | |
| | `dispatch` | `call_id`, `unit_id`, `hospital_id?` | 1 min | Send a unit to a call (optionally pre-routing to a hospital). | | |
| | `classify` | `call_id`, `severity` (1-5) | 1 min | Reclassify a call's severity. | | |
| | `callback` | `call_id`, `message` | 1 min | Phone the caller back. 70% chance they clarify the true emergency type. | | |
| | `wait` | `minutes` (default 1, max 5) | n min | Skip ahead in the simulation when there's nothing to do. | | |
| | `view` | (none) | free | Re-fetch the dispatch center text without advancing time. | | |
| The action also has a free-text `text` field; the server parses lines like | |
| `dispatch CALL-001 ALS-1 H1` so an LLM can produce them directly. | |
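To make the text grammar concrete, here is a minimal sketch of how such a line could be turned into the structured fields from the action table. Only the `dispatch CALL-001 ALS-1 H1` form is shown in this README; the text forms of the other verbs and the function name `parse_action_text` are assumptions for illustration, not the server's actual parser.

```python
import shlex


def parse_action_text(line: str) -> dict:
    """Parse one action line into a dict of action fields.

    Hypothetical helper: only the `dispatch <call> <unit> [hospital]` form is
    documented; the other verbs simply mirror the action table.
    """
    tokens = shlex.split(line.strip())
    if not tokens:
        raise ValueError("empty action line")
    verb, args = tokens[0].lower(), tokens[1:]

    if verb == "dispatch":
        if len(args) not in (2, 3):
            raise ValueError("usage: dispatch <call_id> <unit_id> [hospital_id]")
        action = {"action_type": "dispatch", "call_id": args[0], "unit_id": args[1]}
        if len(args) == 3:
            action["hospital_id"] = args[2]  # optional pre-routing
        return action

    if verb == "classify":
        if len(args) != 2:
            raise ValueError("usage: classify <call_id> <severity 1-5>")
        severity = int(args[1])  # raises ValueError on non-numeric input
        if not 1 <= severity <= 5:
            raise ValueError("severity must be between 1 and 5")
        return {"action_type": "classify", "call_id": args[0], "severity": severity}

    if verb == "callback":
        if len(args) < 2:
            raise ValueError("usage: callback <call_id> <message>")
        return {"action_type": "callback", "call_id": args[0],
                "message": " ".join(args[1:])}

    if verb == "wait":
        minutes = int(args[0]) if args else 1  # default 1, max 5 per the table
        if not 1 <= minutes <= 5:
            raise ValueError("minutes must be between 1 and 5")
        return {"action_type": "wait", "minutes": minutes}

    if verb == "view":
        return {"action_type": "view"}

    raise ValueError(f"unknown action: {verb}")
```

Returning a plain dict keeps the sketch free of dependencies; in the real project the result would presumably feed a `DispatchPulseAction`.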
| ## Observation space | |
| `DispatchPulseObservation` has: | |
| - `text`: formatted dispatch center view (the field the LLM reads) | |
| - `current_time`, `time_limit` | |
| - `calls_pending`, `units_available`, `calls_completed`, `calls_timed_out`, `total_calls` | |
| - `last_action_error`: error string from the previous action, or `None` | |
| - `info_message`: what just happened | |
| - inherited `done`, `reward`, `metadata` | |
| ## Tasks | |
| | Task | Calls | Units | Hospitals | Duration | Caller misreporting | What's hard about it | | |
| |---|---|---|---|---|---|---| | |
| | `easy` | 5 | 4 | 1 | 30 min | 0% | Basic dispatch: learn the action grammar | | |
| | `medium` | 15 | 6 | 2 | 45 min | 20% | Mass casualty bus accident at minute 12; some callers lie | | |
| | `hard` | 30 | 8 | 3 (1 on diversion) | 60 min | 35% | Earthquake: extreme scarcity, panicked callers, hospital triage matters | | |
| All three are deterministic given the seed. | |
| --- | |
| ## Reward function | |
| Final episode score = weighted combination of four components, all in [0, 1]: | |
| | Component | Weight | What it measures | | |
| |---|---|---| | |
| | `survival_score` | 0.60 | Severity-weighted average outcome across all calls (uses clinical survival curves × unit effectiveness × hospital modifier) | | |
| | `efficiency_score` | 0.15 | Fraction of calls dispatched, penalised for wasting ALS on minor calls | | |
| | `triage_accuracy` | 0.15 | Fraction of severity-1 calls dispatched within 25% of their timeout window | | |
| | `penalty` | -0.10 | Deductions for timed-out criticals and wrong-unit assignments | | |
| Severity weights inside the survival score: **3× for severity 1, 2× for 2, 1.5× for 3, 1× for 4, 0.5× for 5**. | |
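The arithmetic can be sketched in a few lines. This is an illustration built from the weights in the table, not the code in `reward.py`: the function names are hypothetical, the penalty is assumed to be a magnitude in [0, 1] that is subtracted with weight 0.10, and the final clamp to [0, 1] is also an assumption.

```python
# Severity weights and component weights as listed in this README.
SEVERITY_WEIGHTS = {1: 3.0, 2: 2.0, 3: 1.5, 4: 1.0, 5: 0.5}


def survival_score(outcomes):
    """Severity-weighted mean of per-call outcomes.

    `outcomes` is a list of (severity, outcome) pairs with outcome in [0, 1].
    """
    total_weight = sum(SEVERITY_WEIGHTS[sev] for sev, _ in outcomes)
    if total_weight == 0:
        return 0.0
    weighted = sum(SEVERITY_WEIGHTS[sev] * out for sev, out in outcomes)
    return weighted / total_weight


def episode_score(survival, efficiency, triage, penalty):
    """Combine the four components; `penalty` is a non-negative magnitude."""
    total = 0.60 * survival + 0.15 * efficiency + 0.15 * triage - 0.10 * penalty
    return max(0.0, min(1.0, total))  # clamp is an assumption


# One saved severity-1 call and one lost severity-4 call:
print(survival_score([(1, 1.0), (4, 0.0)]))
# Component values from the medium row of the baseline table:
print(round(episode_score(0.377, 0.600, 0.500, 0.160), 3))
```

Under these assumptions the formula reproduces the totals in the baseline table further down to within rounding of the displayed components.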
| ### Survival curves (from EMS literature) | |
| | Emergency | Curve | Source / notes | | |
| |---|---|---| | |
| | Cardiac arrest | exponential, ~10%/min decay | Larsen et al. 1993 | | |
| | Trauma | sigmoid centred at 45 min | "golden hour" | | |
| | Stroke | exponential decay | Saver 2006: every minute = 1.9M neurons lost | | |
| | Fire | exponential, doubles per minute | property loss | | |
| | Breathing difficulty | gentler exponential | | | |
| | Minor injury | nearly flat | stable patient | | |
| | Mental health | gentler exponential | de-escalation success | | |
| Each call's outcome is multiplied by: | |
| - **Unit effectiveness** (e.g., ALS → cardiac = 1.0; BLS → cardiac = 0.5; fire engine → cardiac = 0.1) | |
| - **Hospital modifier** (specialty match: +5%; on diversion or zero beds: -15%) | |
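As a rough illustration of how the three factors combine for a cardiac arrest call, consider the sketch below. The unit effectiveness values and the +5% / -15% figures come from this README; everything else is an assumption: the curve starts at 1.0 and loses 10% of its value per minute, the hospital modifier is applied multiplicatively, diversion takes precedence over a specialty match, and the result is clamped to [0, 1]. The actual curves live in `reward.py` and may differ.

```python
# Effectiveness of each unit type at a cardiac arrest (values from this README).
UNIT_EFFECTIVENESS_CARDIAC = {"ALS": 1.0, "BLS": 0.5, "FIRE": 0.1}


def cardiac_curve(minutes, decay_per_min=0.10):
    """Assumed exponential decay: 10% of remaining survival lost per minute."""
    return (1.0 - decay_per_min) ** max(0.0, minutes)


def hospital_modifier(specialty_match=False, on_diversion_or_full=False):
    """Assumed multiplicative form of the +5% / -15% adjustments."""
    if on_diversion_or_full:
        return 0.85  # assumption: diversion overrides a specialty match
    if specialty_match:
        return 1.05
    return 1.0


def cardiac_outcome(minutes, unit, specialty_match=False, on_diversion_or_full=False):
    """Per-call outcome = curve x unit effectiveness x hospital modifier."""
    raw = (cardiac_curve(minutes)
           * UNIT_EFFECTIVENESS_CARDIAC[unit]
           * hospital_modifier(specialty_match, on_diversion_or_full))
    return max(0.0, min(1.0, raw))


# Same response time, different unit types:
for unit in ("ALS", "BLS", "FIRE"):
    print(unit, round(cardiac_outcome(5, unit), 3))
```

With these numbers an ALS unit outperforms a fire engine by a factor of ten at any fixed response time, which is the kind of gap the greedy "closest unit" strategy ignores.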
| --- | |
| ## Baseline scores (heuristic agent, seed=42) | |
| A simple rule-based heuristic (always pick the most-critical call, send the | |
| most effective available unit, reserve ALS for high-severity calls) produces | |
| the following calibrated scores: | |
| | Task | Total | Survival | Efficiency | Triage | Penalty | Completed/Total | | |
| |---|---|---|---|---|---|---| | |
| | easy | 0.5476 | 0.463 | 0.800 | 1.000 | -0.000 | 4/5 | | |
| | medium | 0.3750 | 0.377 | 0.600 | 0.500 | -0.160 | 9/15 | | |
| | hard | 0.2183 | 0.214 | 0.433 | 0.500 | -0.500 | 13/30 | | |
| | **Average** | **0.3803** | | | | | | | |
| The clean monotonic decrease across difficulty (easy > medium > hard) confirms | |
| the env discriminates between scenarios as designed. | |
| --- | |
| ## Inference script: `inference.py` | |
| Per the hackathon spec, `inference.py` is in the **project root** and follows | |
| the mandatory contract: | |
| ### Required environment variables | |
| | Variable | Purpose | Default in script | | |
| |---|---|---| | |
| | `API_BASE_URL` | LLM endpoint | `https://router.huggingface.co/v1` | | |
| | `MODEL_NAME` | Which model to call | `Qwen/Qwen2.5-72B-Instruct` | | |
| | `HF_TOKEN` | API key for the LLM | (no default) | | |
| | `LOCAL_IMAGE_NAME` | Docker image for `from_docker_image()` | (no default) | | |
| | `DISPATCHPULSE_TASK` | Which task to run (`easy`/`medium`/`hard`) | `easy` | | |
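For readers wiring up their own runner, the table translates into a small config loader along these lines. This is a sketch, not the code in `inference.py`: the `InferenceConfig` class and `load_config` function are hypothetical names, and only the variable names and defaults are taken from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

VALID_TASKS = ("easy", "medium", "hard")


@dataclass(frozen=True)
class InferenceConfig:
    api_base_url: str
    model_name: str
    hf_token: Optional[str]          # no default: None when unset
    local_image_name: Optional[str]  # no default: None when unset
    task: str


def load_config(env: Mapping[str, str] = os.environ) -> InferenceConfig:
    """Read the documented variables, applying the defaults from the table."""
    task = env.get("DISPATCHPULSE_TASK", "easy")
    if task not in VALID_TASKS:
        raise ValueError(f"DISPATCHPULSE_TASK must be one of {VALID_TASKS}")
    return InferenceConfig(
        api_base_url=env.get("API_BASE_URL", "https://router.huggingface.co/v1"),
        model_name=env.get("MODEL_NAME", "Qwen/Qwen2.5-72B-Instruct"),
        hf_token=env.get("HF_TOKEN"),
        local_image_name=env.get("LOCAL_IMAGE_NAME"),
        task=task,
    )
```

Passing the environment as a mapping rather than reading `os.environ` inside the function keeps the loader easy to test with a plain dict.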
| ### Stdout format (verbatim) | |
| ``` | |
| [START] task=<task_name> env=dispatchpulse model=<model_name> | |
| [STEP] step=<n> action=<action_str> reward=<0.00> done=<true|false> error=<msg|null> | |
| [END] success=<true|false> steps=<n> score=<score> rewards=<r1,r2,...,rn> | |
| ``` | |
| - One `[START]` line at episode begin | |
| - One `[STEP]` line per step, immediately after `env.step()` returns | |
| - One `[END]` line after `env.close()`, ALWAYS emitted (even on exception) | |
| - `reward` and `rewards` to 2 decimal places; `score` to 3 decimal places | |
| - `done` and `success` are lowercase booleans | |
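The formatting rules above are easy to get subtly wrong, so here is a minimal sketch of formatters that follow them. The function names are hypothetical and this is not necessarily how `inference.py` builds its lines; it only encodes the documented layout, the decimal places, and the lowercase booleans.

```python
def _fmt_bool(value):
    """Lowercase booleans, as the stdout contract requires."""
    return "true" if value else "false"


def start_line(task, model, env="dispatchpulse"):
    return f"[START] task={task} env={env} model={model}"


def step_line(step, action, reward, done, error=None):
    # reward to 2 decimal places; a missing error is printed as `null`
    err = error if error else "null"
    return (f"[STEP] step={step} action={action} reward={reward:.2f} "
            f"done={_fmt_bool(done)} error={err}")


def end_line(success, steps, score, rewards):
    # score to 3 decimal places; each reward to 2 decimal places
    joined = ",".join(f"{r:.2f}" for r in rewards)
    return (f"[END] success={_fmt_bool(success)} steps={steps} "
            f"score={score:.3f} rewards={joined}")


print(start_line("easy", "Qwen/Qwen2.5-72B-Instruct"))
print(step_line(1, "wait 1", 0.0, False))
print(end_line(True, 1, 0.5476, [0.0]))
```

To honour the "always emitted" rule, a runner would typically print the `[END]` line from a `finally` block so that it appears even when the episode raises.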
| ### Connection logic | |
| 1. If `LOCAL_IMAGE_NAME` is set → `await DispatchPulseEnv.from_docker_image(LOCAL_IMAGE_NAME)` | |
| 2. Else if `ENV_BASE_URL` is set → connect directly to a running env server | |
| 3. Otherwise → spin up an in-process simulation as a fallback (for offline runs) | |
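The precedence in the list above can be expressed as a small pure function. This is a sketch of the decision only, under the assumption that an empty variable counts as unset; `choose_connection` and the mode labels are hypothetical names, and the actual connection calls are left to the caller.

```python
from typing import Mapping, Optional, Tuple


def choose_connection(env: Mapping[str, str]) -> Tuple[str, Optional[str]]:
    """Return (mode, target) following the documented precedence."""
    image = env.get("LOCAL_IMAGE_NAME")
    if image:
        # 1. A local Docker image wins over everything else.
        return ("docker", image)
    base_url = env.get("ENV_BASE_URL")
    if base_url:
        # 2. Otherwise connect to an already running env server.
        return ("remote", base_url)
    # 3. Fall back to the in-process simulation (offline runs).
    return ("in_process", None)


print(choose_connection({"ENV_BASE_URL": "http://localhost:8000"}))
print(choose_connection({}))
```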
| ### Run it | |
| ```bash | |
| # Against the live HF Space | |
| ENV_BASE_URL=https://arun-sanjay-dispatchpulse.hf.space \ | |
| HF_TOKEN=$HF_TOKEN \ | |
| python inference.py | |
| # Against a local Docker image | |
| LOCAL_IMAGE_NAME=dispatchpulse:latest \ | |
| HF_TOKEN=$HF_TOKEN \ | |
| python inference.py | |
| # In-process fallback (no network, no Docker) | |
| python inference.py | |
| ``` | |
| --- | |
| ## Setup | |
| ### Run locally with Python | |
| ```bash | |
| python -m venv .venv && source .venv/bin/activate | |
| pip install -e . | |
| python inference.py | |
| ``` | |
| ### Run locally with Docker | |
| ```bash | |
| docker build -t dispatchpulse . | |
| docker run -p 8000:8000 dispatchpulse | |
| # Then in another shell: | |
| curl http://localhost:8000/health | |
| ``` | |
| ### Use as a client (OpenEnv `EnvClient` pattern) | |
| ```python | |
| import asyncio | |
| from client import DispatchPulseEnv | |
| from models import DispatchPulseAction | |
| async def main(): | |
|     # Connect to the hosted Space; the context manager closes the session. | |
|     async with DispatchPulseEnv(base_url="https://arun-sanjay-dispatchpulse.hf.space") as env: | |
|         result = await env.reset(task_name="easy", seed=42) | |
|         while not result.done: | |
|             # Trivial policy: advance the clock one minute per step. | |
|             action = DispatchPulseAction(action_type="wait", minutes=1, text="wait 1") | |
|             result = await env.step(action) | |
|             print(result.observation.text[:200]) | |
|         print(f"Final score: {result.reward}") | |
| asyncio.run(main()) | |
| ``` | |
| ### Run on Hugging Face Spaces | |
| Auto-built as a Docker Space: | |
| [`https://huggingface.co/spaces/Arun-Sanjay/dispatchpulse`](https://huggingface.co/spaces/Arun-Sanjay/dispatchpulse) | |
| --- | |
| ## Pre-submission validator | |
| Run the same three checks the hackathon's automated grader runs: | |
| ```bash | |
| ./scripts/validate-submission.sh https://arun-sanjay-dispatchpulse.hf.space . | |
| ``` | |
| It checks: | |
| 1. **HF Space deploys**: `POST /reset` returns HTTP 200 | |
| 2. **Docker build**: `docker build .` succeeds (≤ 10 min) | |
| 3. **OpenEnv compliance**: `openenv validate` passes | |
| --- | |
| ## Calibration tests | |
| The reward function ships with calibration tests that double as documentation: | |
| ```bash | |
| python tests/test_reward.py | |
| python tests/test_simulation.py | |
| ``` | |
| These verify that: | |
| - Survival curves match published clinical numbers | |
| - A "do-nothing" agent scores below 0.15 on every task | |
| - A simple heuristic strictly outperforms the silent agent | |
| - Heuristic scores monotonically decrease easy → medium → hard | |
| - ALS at cardiac arrest beats fire engine at cardiac arrest by ≥5× | |
| - Specialty hospital match boosts outcome; diversion hurts it | |
| --- | |
| ## License | |
| Apache 2.0. Built for the Meta PyTorch OpenEnv Hackathon - India 2026. | |