Spaces:
Sleeping
Sleeping
| title: Pyre — Crisis Navigation Environment | |
| emoji: 🔥 | |
| colorFrom: red | |
| colorTo: yellow | |
| sdk: docker | |
| pinned: false | |
| app_port: 8000 | |
| tags: | |
| - openenv | |
| # Pyre — Crisis Navigation Environment for LLM Agents | |
| > *When buildings burn, the difference between a safe evacuation and a tragedy is the quality of decisions made in the first 60 seconds. Can we train an LLM to make them?* | |
| **Pyre** places an LLM agent *inside* a burning building. The agent must navigate to safety under partial observability — no global map, hard time pressure, and a fire that actively spreads and blocks exits. | |
| --- | |
| ## Why Pyre vs. existing environments | |
| | Feature | `grid_world` | `maze_env` | `wildfire_env` | **Pyre** | | |
| |---|---|---|---|---| | |
| | Observability | Full | Full | Partial | **Partial, first-person, text** | | |
| | Map dynamics | Static | Static | Dynamic (fire) | **Dynamic (fire + doors)** | | |
| | Action richness | 4 moves | 4 moves | Suppression | **Movement + door control + look** | | |
| | Agent role | Mover | Mover | Suppressor | **Survivor** | | |
| | Reward complexity | Reach goal | Reach goal | Suppress fire | **8-component composite rubric** | | |
| *`wildfire_env` trains an agent to fight fires from above; Pyre trains an agent to survive from inside.* | |
| --- | |
| ## What the agent sees (narrative observation) | |
| Every step the agent receives a first-person text observation: | |
| ``` | |
| You are in the **main_corridor**. The air is **moderate**. | |
| Health: ████████░░ (85/100) | Wind: **EAST** | |
| Flames are visible to the **west**. | |
| Exits visible: exit_0_7 at 8m west. | |
| Doors: door_1 (closed) at 2m east. | |
| You hear: Fire alarm sounding; Smoke detector beeping. | |
| Last action: You move south. The smoke is thick here. | |
| Available actions: move(direction='north') move(direction='south') door(target_id='door_1', door_state='open') look(direction='east') wait() | |
| ``` | |
| --- | |
| ## Action space | |
| | Action | Parameters | Effect | | |
| |---|---|---| | |
| | `move` | `direction` | Move one cell N/S/E/W | | |
| | `door` | `target_id`, `door_state` | Open or close a nearby door — closed doors slow fire spread to 15% | | |
| | `look` | `direction` | Scan up to 5 cells in one direction for detailed zone/fire/smoke info | | |
| | `wait` | — | Skip turn | | |
| --- | |
| ## Reward function (composite rubric) | |
| **Per step:** | |
| - `-0.01` constant time penalty | |
| - `+0.1` moved closer to nearest unblocked exit (BFS distance) | |
| - `-0.5` moved into smoke ≥ moderate or fire-adjacent cell | |
| - `-0.02 × damage` health drain from smoke/fire exposure | |
| - `+0.5` closed a door adjacent to active fire (strategic) | |
| **Episode end:** | |
| - `+5.0` agent evacuated alive | |
| - `-10.0` agent incapacitated | |
| - `+0.05 × remaining_steps` time bonus for fast evacuation | |
| --- | |
| ## Quick start | |
| ```bash | |
| cd pyre_env | |
| uv sync | |
| uv run server # → http://localhost:8000 | |
| # Health check | |
| curl http://localhost:8000/health | |
| # Reset (difficulty: easy | medium | hard) | |
| curl -X POST http://localhost:8000/reset \ | |
| -H "Content-Type: application/json" \ | |
| -d '{"difficulty": "medium"}' | |
| # Step | |
| curl -X POST http://localhost:8000/step \ | |
| -H "Content-Type: application/json" \ | |
| -d '{"action": "move", "direction": "north"}' | |
| # Random baseline (5 episodes) | |
| python examples/random_agent.py --episodes 5 --verbose | |
| ``` | |
| ### Python client | |
| ```python | |
| from pyre_env import PyreEnv, PyreAction | |
| with PyreEnv(base_url="http://localhost:8000") as env: | |
| result = env.reset() | |
| print(result.observation.narrative) | |
| result = env.step(PyreAction(action="move", direction="north")) | |
| print(f"Reward: {result.reward}, Health: {result.observation.agent_health}") | |
| ``` | |
| --- | |
| ## Difficulty levels | |
| | Level | Fire sources | Spread rate | Wind | Humidity | Max steps | | |
| |---|---|---|---|---|---| | |
| | `easy` | 1 | 10–20% | Calm | 30–50% | 200 | | |
| | `medium` | 2–4 | 15–40% | Any | 10–45% | 150 | | |
| | `hard` | 3–5 | 30–55% | Never calm | 5–20% | 100 | | |
| --- | |
| ## Deployment | |
| ```bash | |
| openenv push --repo-id your-org/pyre-env | |
| ``` | |
| --- | |
| ## Project structure | |
| ``` | |
| pyre_env/ | |
| ├── models.py PyreAction, PyreObservation, PyreMapState, PyreState | |
| ├── client.py PyreEnv (EnvClient subclass) | |
| ├── openenv.yaml OpenEnv manifest | |
| ├── pyproject.toml | |
| ├── server/ | |
| │ ├── app.py FastAPI bootstrap | |
| │ ├── pyre_env_environment.py Main Environment class | |
| │ ├── floor_plan.py 3 building templates + episode generation | |
| │ ├── fire_sim.py Cellular automaton fire/smoke simulation | |
| │ ├── narrative.py Visibility + first-person text observation renderer | |
| │ └── rubrics.py 8 composable reward components | |
| └── examples/ | |
| └── random_agent.py Smoke-test baseline | |
| ``` | |
| --- | |
| ## Hackathon alignment | |
| - **Theme #2 — Long-Horizon Planning**: 50–200 step episodes; agent must build a mental map across many partial observations | |
| - **Theme #3.1 — World Modeling**: no global map; agent infers fire spread, corridor topology, and exit reachability from local text observations alone | |