metadata

title: DispatchPulse
emoji: 🚑
colorFrom: red
colorTo: blue
sdk: docker
app_port: 8000
pinned: false
license: apache-2.0

DispatchPulse

An OpenEnv environment where an AI agent acts as a 911 emergency dispatch coordinator. The agent receives incoming calls, classifies their severity, and dispatches limited emergency units (ALS / BLS ambulances, fire engines, police) under time pressure. Patient outcomes are scored against real clinical survival curves — no LLM-as-judge, just defensible math.

Submission for the Meta PyTorch OpenEnv Hackathon — India 2026.

Why this environment

In India, an estimated 24,000+ people die every day because of slow emergency response: the average ambulance response time is 25–35 minutes, which consumes a large share of the golden hour before treatment even begins, and only ~20% of ambulances carry advanced life support. DispatchPulse simulates this crisis as an interactive RL environment in which the agent has to learn the counter-intuitive strategies real dispatchers use:

The greedy "closest unit" strategy fails. Dispatching the only ALS unit to a sprained ankle leaves nothing for the cardiac arrest that arrives 3 minutes later, and survival for that patient drops from 70% to 15%.
Triage matters more than speed. A weighted reward (severity-1 calls count 3× as much as severity-4 calls) means the agent has to prioritise, not just react.
Hospital choice matters. Sending a stroke patient to a hospital without a stroke unit, or to one on diversion, costs you score.

The reward function uses real clinical survival curves from the EMS literature (Larsen et al. 1993 for cardiac arrest; Saver 2006 "Time is Brain" for stroke; golden hour curves for trauma). It's deterministic, defensible, and gives a continuous signal an RL agent can actually learn from.

OpenEnv compliance

Requirement	Status
Real-world task (not games or toys)	✅ Emergency dispatch — actual profession
Typed Pydantic models inheriting from OpenEnv `Action` / `Observation` / `State`	✅ `models.py`
`Environment` base-class subclass with `reset()` / `step()` / `state`	✅ `server/environment.py`
FastAPI server via `create_fastapi_app(...)`	✅ `server/app.py`
`EnvClient` client with `_step_payload` / `_parse_result` / `_parse_state`	✅ `client.py`
`openenv.yaml` manifest	✅
≥ 3 tasks with graders, scores 0.0–1.0	✅ easy / medium / hard
Meaningful reward + partial progress	✅ survival curves + per-step rewards
`inference.py` at root, OpenAI client, mandatory env vars, `[START]/[STEP]/[END]` format	✅
Reproducible (fixed seed)	✅ `seed=42` default everywhere
Pre-submission validator script	✅ `scripts/validate-submission.sh`
Dockerfile + HF Spaces deploy	✅ uses `openenv-base`
Runs on 2 vCPU / 8 GB RAM	✅ pure Python math, no ML inference

Project layout (canonical OpenEnv structure)

DispatchPulse/
├── README.md
├── Dockerfile               # uses ghcr.io/meta-pytorch/openenv-base
├── openenv.yaml             # OpenEnv manifest
├── pyproject.toml
├── inference.py             # ROUND 1 ENTRY POINT — must be in root
├── client.py                # DispatchPulseEnv (subclass of EnvClient)
├── models.py                # DispatchPulseAction / Observation / State
│                            # plus internal sim models
├── simulation.py            # DispatchSimulation engine
├── reward.py                # Survival curves + episode reward
├── grader.py                # Programmatic 0.0–1.0 grader
├── scenario_loader.py       # YAML task loader
├── text_view.py             # LLM-friendly dispatch center renderer
├── utils.py                 # Distance / ETA / templates
├── server/
│   ├── __init__.py
│   ├── app.py               # FastAPI app via create_fastapi_app(...)
│   └── environment.py       # DispatchPulseEnvironment(Environment)
├── tasks/
│   ├── easy.yaml
│   ├── medium.yaml
│   └── hard.yaml
├── scripts/
│   └── validate-submission.sh   # runs the 3 grader checks locally
└── tests/
    ├── test_reward.py
    └── test_simulation.py

Action space (typed Pydantic)

DispatchPulseAction has these action_type values:

`action_type`	Required fields	Time cost	What it does
`dispatch`	`call_id`, `unit_id`, `hospital_id?`	1 min	Send a unit to a call (optionally pre-routing to a hospital).
`classify`	`call_id`, `severity` (1-5)	1 min	Reclassify a call's severity.
`callback`	`call_id`, `message`	1 min	Phone the caller back. 70% chance they clarify the true emergency type.
`wait`	`minutes` (default 1, max 5)	n min	Skip ahead in the simulation when there's nothing to do.
`view`	—	free	Re-fetch the dispatch center text without advancing time.

The action also has a free-text text field — the server parses lines like dispatch CALL-001 ALS-1 H1 so an LLM can produce them directly.
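As a rough illustration of the free-text path, the sketch below parses the two command shapes that appear in this README (`dispatch CALL-001 ALS-1 H1` and `wait 1`) plus `view`. The helper names `parse_command` and `time_cost` are hypothetical and are not taken from the server code; the text grammar for `classify` and `callback` is not shown here, so those are left out rather than guessed. Rejecting an out-of-range `wait` (instead of clamping it) is also an assumption.

```python
from typing import Any, Optional

# Simulated minutes consumed by each fixed-cost action, per the table above.
FIXED_TIME_COST = {"dispatch": 1, "classify": 1, "callback": 1, "view": 0}
MAX_WAIT_MINUTES = 5


def parse_command(line: str) -> dict[str, Any]:
    """Parse one free-text command into action fields (hypothetical helper)."""
    tokens = line.split()
    if not tokens:
        raise ValueError("empty command")
    verb, args = tokens[0].lower(), tokens[1:]

    if verb == "dispatch" and len(args) in (2, 3):
        # hospital_id is optional (written hospital_id? in the table).
        hospital_id: Optional[str] = args[2] if len(args) == 3 else None
        return {"action_type": "dispatch", "call_id": args[0],
                "unit_id": args[1], "hospital_id": hospital_id}
    if verb == "wait" and len(args) <= 1:
        minutes = int(args[0]) if args else 1  # default of 1 from the table
        if not 1 <= minutes <= MAX_WAIT_MINUTES:
            # Assumption: out-of-range waits are rejected, not clamped.
            raise ValueError(f"wait must be 1..{MAX_WAIT_MINUTES} minutes")
        return {"action_type": "wait", "minutes": minutes}
    if verb == "view" and not args:
        return {"action_type": "view"}
    raise ValueError(f"unrecognised command: {line!r}")


def time_cost(action: dict[str, Any]) -> int:
    """Minutes of simulated time the action consumes."""
    if action["action_type"] == "wait":
        return action["minutes"]
    return FIXED_TIME_COST[action["action_type"]]
```

The returned dict mirrors the field names in the action table, so it could be passed as keyword arguments to a typed action model if the field names match.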

Observation space

DispatchPulseObservation has:

text — formatted dispatch center view (the field the LLM reads)
current_time, time_limit
calls_pending, units_available, calls_completed, calls_timed_out, total_calls
last_action_error — error string from the previous action, or None
info_message — what just happened
inherited done, reward, metadata

Tasks

Task	Calls	Units	Hospitals	Duration	Caller misreporting	What's hard about it
`easy`	5	4	1	30 min	0%	Basic dispatch — learn the action grammar
`medium`	15	6	2	45 min	20%	Mass casualty bus accident at minute 12; some callers lie
`hard`	30	8	3 (1 on diversion)	60 min	35%	Earthquake — extreme scarcity, panicked callers, hospital triage matters

All three are deterministic given the seed.
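One common way to get this kind of seed determinism is to draw every random decision from a private, seeded generator. The sketch below applies that idea to caller misreporting; the function name `misreport_flags` and the per-call independent draw are assumptions for illustration, not a description of `simulation.py`.

```python
import random


def misreport_flags(n_calls: int, misreport_rate: float, seed: int = 42) -> list[bool]:
    """Decide, per call, whether the caller misreports (illustrative only)."""
    # A private Random instance keeps results independent of global RNG state.
    rng = random.Random(seed)
    return [rng.random() < misreport_rate for _ in range(n_calls)]


# Same seed and parameters give the same scenario on every run.
medium_run_a = misreport_flags(n_calls=15, misreport_rate=0.20, seed=42)
medium_run_b = misreport_flags(n_calls=15, misreport_rate=0.20, seed=42)
print(medium_run_a == medium_run_b)  # True
```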

Reward function

Final episode score = weighted combination of four components, each in [0, 1]:

Component	Weight	What it measures
`survival_score`	0.60	Severity-weighted average outcome across all calls (uses clinical survival curves × unit effectiveness × hospital modifier)
`efficiency_score`	0.15	Fraction of calls dispatched, penalised for wasting ALS on minor calls
`triage_accuracy`	0.15	Fraction of severity-1 calls dispatched within the first 25% of their timeout window
`penalty`	−0.10	Deductions for timed-out criticals and wrong-unit assignments
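Read literally, the table describes a linear combination. The sketch below applies the four weights as written; the function name `episode_score` is hypothetical, and whether `reward.py` additionally clamps or rescales the result is not stated here.

```python
# Weights copied from the component table above.
WEIGHTS = {
    "survival_score": 0.60,
    "efficiency_score": 0.15,
    "triage_accuracy": 0.15,
    "penalty": -0.10,
}


def episode_score(survival: float, efficiency: float,
                  triage: float, penalty: float) -> float:
    """Weighted sum of the four components, each expected in [0, 1]."""
    return (WEIGHTS["survival_score"] * survival
            + WEIGHTS["efficiency_score"] * efficiency
            + WEIGHTS["triage_accuracy"] * triage
            + WEIGHTS["penalty"] * penalty)


# Component values for the `easy` heuristic baseline reported in this README.
print(f"{episode_score(0.463, 0.800, 1.000, 0.000):.3f}")
```

Applied as-is, the positive weights sum to 0.90, so a perfect episode with no penalty would total 0.90 under this reading.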

Severity weights inside the survival score: 3× for severity 1, 2× for 2, 1.5× for 3, 1× for 4, 0.5× for 5.
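To make the effect of these weights concrete, here is a minimal sketch of a severity-weighted average. The function name and the `(severity, outcome)` input shape are assumptions; only the weights come from the line above.

```python
# Severity weights from the README: higher weight = more influence on the score.
SEVERITY_WEIGHT = {1: 3.0, 2: 2.0, 3: 1.5, 4: 1.0, 5: 0.5}


def severity_weighted_survival(calls: list[tuple[int, float]]) -> float:
    """Weighted mean of per-call outcomes; each call is (severity, outcome in [0, 1])."""
    total_weight = sum(SEVERITY_WEIGHT[severity] for severity, _ in calls)
    if total_weight == 0:
        return 0.0  # no calls: nothing to score
    weighted_sum = sum(SEVERITY_WEIGHT[severity] * outcome
                       for severity, outcome in calls)
    return weighted_sum / total_weight


# A badly served severity-1 call outweighs a perfectly served severity-4 call:
# (3.0 * 0.2 + 1.0 * 1.0) / (3.0 + 1.0) = 0.4, versus an unweighted mean of 0.6.
print(severity_weighted_survival([(1, 0.2), (4, 1.0)]))
```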

Survival curves (from EMS literature)

Emergency	Curve	Source / notes
Cardiac arrest	exponential, ~10%/min decay	Larsen et al. 1993
Trauma	sigmoid centred at 45 min	"golden hour"
Stroke	exponential decay	Saver 2006: about 1.9 million neurons lost per minute untreated
Fire	exponential, doubles per minute	property loss
Breathing difficulty	gentler exponential
Minor injury	nearly flat	stable patient
Mental health	gentler exponential	de-escalation success
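The two curve shapes named most precisely in the table can be sketched as follows. This is an illustration under stated assumptions, not the code in `reward.py`: "~10%/min decay" is read as multiplying survival by 0.9 each minute, the 0.70 starting value is borrowed from the cardiac-arrest example earlier in this README, and the trauma `ceiling` and `steepness` parameters are hypothetical placeholders.

```python
import math


def cardiac_survival(minutes: float, baseline: float = 0.70,
                     decay_per_min: float = 0.10) -> float:
    """Exponential decay: survival shrinks by a fixed fraction every minute."""
    return baseline * (1.0 - decay_per_min) ** minutes


def trauma_survival(minutes: float, midpoint: float = 45.0,
                    steepness: float = 0.1, ceiling: float = 0.95) -> float:
    """Sigmoid centred at `midpoint`: near `ceiling` early, half of it at 45 min."""
    # steepness and ceiling are assumed values, chosen only to show the shape.
    return ceiling / (1.0 + math.exp(steepness * (minutes - midpoint)))


for t in (0, 5, 10, 45, 60):
    print(t, round(cardiac_survival(t), 3), round(trauma_survival(t), 3))
```

Both functions are continuous and strictly decreasing in time, which is what lets the reward give a graded signal rather than a pass/fail one.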

Each call's outcome is multiplied by:

Unit effectiveness (e.g., ALS → cardiac = 1.0; BLS → cardiac = 0.5; fire engine → cardiac = 0.1)
Hospital modifier (specialty match: +5%; on diversion or zero beds: −15%)
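A minimal sketch of how these multipliers might compose is shown below. Several details are assumptions made for illustration: the percentages are treated as multiplicative factors (1.05 and 0.85 territory), the bonus and the penalty are allowed to stack, the result is capped at 1.0, and the identifiers `fire_engine` and `cardiac_arrest` are hypothetical keys. Only the numeric values come from the list above.

```python
# Effectiveness values for cardiac arrest, from the examples above.
UNIT_EFFECTIVENESS = {
    ("ALS", "cardiac_arrest"): 1.0,
    ("BLS", "cardiac_arrest"): 0.5,
    ("fire_engine", "cardiac_arrest"): 0.1,
}


def hospital_modifier(specialty_match: bool, on_diversion: bool,
                      beds_available: int) -> float:
    """+5% for a specialty match; -15% if on diversion or out of beds."""
    modifier = 1.0
    if specialty_match:
        modifier += 0.05
    if on_diversion or beds_available == 0:
        modifier -= 0.15
    return modifier


def call_outcome(curve_value: float, unit_type: str, emergency: str,
                 specialty_match: bool, on_diversion: bool,
                 beds_available: int) -> float:
    """Survival-curve value scaled by unit effectiveness and hospital modifier."""
    effectiveness = UNIT_EFFECTIVENESS[(unit_type, emergency)]
    modifier = hospital_modifier(specialty_match, on_diversion, beds_available)
    return min(1.0, curve_value * effectiveness * modifier)  # assumed cap at 1.0


als = call_outcome(0.6, "ALS", "cardiac_arrest", True, False, 3)
fire = call_outcome(0.6, "fire_engine", "cardiac_arrest", False, False, 3)
print(round(als, 3), round(fire, 3))
```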

Baseline scores (heuristic agent, seed=42)

A simple rule-based heuristic (always pick the most-critical call, send the most effective available unit, reserve ALS for high-severity calls) produces the following calibrated scores:

Task	Total	Survival	Efficiency	Triage	Penalty	Completed/Total
easy	0.5476	0.463	0.800	1.000	−0.000	4/5
medium	0.3750	0.377	0.600	0.500	−0.160	9/15
hard	0.2183	0.214	0.433	0.500	−0.500	13/30
Average	0.3803

The strictly decreasing totals across difficulty (easy > medium > hard) are consistent with the tasks being ordered by difficulty as designed, at least for this heuristic.

Inference script — `inference.py`

Per the hackathon spec, inference.py lives in the project root and follows the mandatory contract described below.

Required environment variables

Variable	Purpose	Default in script
`API_BASE_URL`	LLM endpoint	`https://router.huggingface.co/v1`
`MODEL_NAME`	Which model to call	`Qwen/Qwen2.5-72B-Instruct`
`HF_TOKEN`	API key for the LLM	(no default)
`LOCAL_IMAGE_NAME`	Docker image for `from_docker_image()`	(no default)
`DISPATCHPULSE_TASK`	Which task to run (`easy`/`medium`/`hard`)	`easy`

Stdout format (verbatim)

[START] task=<task_name> env=dispatchpulse model=<model_name>
[STEP] step=<n> action=<action_str> reward=<0.00> done=<true|false> error=<msg|null>
[END] success=<true|false> steps=<n> score=<score> rewards=<r1,r2,...,rn>

One [START] line at episode begin
One [STEP] line per step, immediately after env.step() returns
One [END] line after env.close(), ALWAYS emitted (even on exception)
reward and rewards to 2 decimal places; score to 3 decimal places
done and success are lowercase booleans
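The formatting rules above can be captured in three small helpers. The function names are hypothetical and this is a sketch of the line format only, not the contents of `inference.py`.

```python
from typing import Optional, Sequence


def format_start(task: str, model: str, env: str = "dispatchpulse") -> str:
    return f"[START] task={task} env={env} model={model}"


def format_step(step: int, action: str, reward: float,
                done: bool, error: Optional[str]) -> str:
    # Booleans are lowercased; a missing error is written as the literal "null".
    return (f"[STEP] step={step} action={action} reward={reward:.2f} "
            f"done={str(done).lower()} error={error if error else 'null'}")


def format_end(success: bool, steps: int, score: float,
               rewards: Sequence[float]) -> str:
    reward_list = ",".join(f"{r:.2f}" for r in rewards)  # 2 decimals each
    return (f"[END] success={str(success).lower()} steps={steps} "
            f"score={score:.3f} rewards={reward_list}")  # score: 3 decimals


print(format_start("easy", "Qwen/Qwen2.5-72B-Instruct"))
print(format_step(1, "wait 1", 0.0, False, None))
print(format_end(True, 2, 0.5476, [0.0, 0.55]))
```

Calling `format_end` from a `finally` block around the episode loop is one straightforward way to satisfy the "always emitted, even on exception" rule.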

Connection logic

If LOCAL_IMAGE_NAME is set → await DispatchPulseEnv.from_docker_image(LOCAL_IMAGE_NAME)
Else if ENV_BASE_URL is set → connect directly to a running env server at that URL (this variable is optional and is not listed in the table above)
Otherwise → spin up an in-process simulation as a fallback (for offline runs)
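The precedence described above can be expressed as a small pure function. The name `select_connection` and the returned mode labels are hypothetical; only the order of the checks is taken from this README.

```python
from typing import Mapping, Optional


def select_connection(env: Mapping[str, str]) -> tuple[str, Optional[str]]:
    """Return (mode, target) following the documented precedence."""
    image = env.get("LOCAL_IMAGE_NAME")
    if image:  # highest priority: start the env from a local Docker image
        return ("docker", image)
    base_url = env.get("ENV_BASE_URL")
    if base_url:  # next: talk to an already-running env server
        return ("remote", base_url)
    return ("in_process", None)  # fallback: offline in-process simulation


print(select_connection({"ENV_BASE_URL": "https://arun-sanjay-dispatchpulse.hf.space"}))
print(select_connection({}))
```

In a script this would typically be called with `os.environ`; taking a mapping as a parameter keeps the precedence easy to test without touching the real environment.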

Run it

# Against the live HF Space
ENV_BASE_URL=https://arun-sanjay-dispatchpulse.hf.space \
HF_TOKEN=$HF_TOKEN \
python inference.py

# Against a local Docker image
LOCAL_IMAGE_NAME=dispatchpulse:latest \
HF_TOKEN=$HF_TOKEN \
python inference.py

# In-process fallback (no network, no Docker)
python inference.py

Setup

Run locally with Python

python -m venv .venv && source .venv/bin/activate
pip install -e .
python inference.py

Run locally with Docker

docker build -t dispatchpulse .
docker run -p 8000:8000 dispatchpulse
# Then in another shell:
curl http://localhost:8000/health

Use as a client (OpenEnv `EnvClient` pattern)

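import asyncio
from client import DispatchPulseEnv
from models import DispatchPulseAction

async def main():
    # Connect to the hosted Space for the duration of the `async with` block.
    async with DispatchPulseEnv(base_url="https://arun-sanjay-dispatchpulse.hf.space") as env:
        # A fixed task and seed make the episode reproducible.
        result = await env.reset(task_name="easy", seed=42)
        while not result.done:
            # Placeholder policy: always wait one minute. Swap in a real agent here.
            action = DispatchPulseAction(action_type="wait", minutes=1, text="wait 1")
            result = await env.step(action)
            print(result.observation.text[:200])  # first 200 characters of the dispatch view
        print(f"Final score: {result.reward}")

asyncio.run(main())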

Run on Hugging Face Spaces

Auto-built as a Docker Space: https://huggingface.co/spaces/Arun-Sanjay/dispatchpulse

Pre-submission validator

Run the same three checks the hackathon's automated grader runs:

./scripts/validate-submission.sh https://arun-sanjay-dispatchpulse.hf.space .

It checks:

HF Space deploys — POST /reset returns HTTP 200
Docker build — docker build . succeeds (≤ 10 min)
OpenEnv compliance — openenv validate passes

Calibration tests

The reward function ships with calibration tests that double as documentation:

python tests/test_reward.py
python tests/test_simulation.py

These verify that:

Survival curves match published clinical numbers
A "do-nothing" agent scores below 0.15 on every task
A simple heuristic strictly outperforms the do-nothing agent
Heuristic scores monotonically decrease easy → medium → hard
ALS at cardiac arrest beats fire engine at cardiac arrest by ≥5×
Specialty hospital match boosts outcome; diversion hurts it
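Two of these checks can be sketched with plain assertions, in the spirit of tests that run directly with `python`. The numbers below are copied from tables in this README as stand-ins; the real tests presumably compute them from the simulation, and the function names and the `fire_engine` key are hypothetical.

```python
# Stand-in values copied from this README (heuristic baseline, seed=42).
BASELINE_TOTALS = {"easy": 0.5476, "medium": 0.3750, "hard": 0.2183}
CARDIAC_EFFECTIVENESS = {"ALS": 1.0, "BLS": 0.5, "fire_engine": 0.1}


def check_monotonic_difficulty(totals: dict[str, float]) -> None:
    """Heuristic score must strictly decrease from easy to medium to hard."""
    assert totals["easy"] > totals["medium"] > totals["hard"], totals


def check_als_advantage(effectiveness: dict[str, float],
                        min_ratio: float = 5.0) -> None:
    """ALS must beat a fire engine at cardiac arrest by at least min_ratio."""
    assert effectiveness["ALS"] >= min_ratio * effectiveness["fire_engine"]


if __name__ == "__main__":
    check_monotonic_difficulty(BASELINE_TOTALS)
    check_als_advantage(CARDIAC_EFFECTIVENESS)
    print("calibration checks passed")
```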

License

Apache 2.0. Built for the Meta PyTorch OpenEnv Hackathon — India 2026.