---
title: Pyre — Crisis Navigation Environment
emoji: 🔥
colorFrom: red
colorTo: yellow
sdk: docker
pinned: false
app_port: 8000
tags:
- openenv
---
# Pyre — Crisis Navigation Environment for LLM Agents
> *When buildings burn, the difference between a safe evacuation and a tragedy is the quality of decisions made in the first 60 seconds. Can we train an LLM to make them?*
**Pyre** places an LLM agent *inside* a burning building. The agent must navigate to safety under partial observability — no global map, hard time pressure, and a fire that actively spreads and blocks exits.
---
## Why Pyre vs. existing environments
| Feature | `grid_world` | `maze_env` | `wildfire_env` | **Pyre** |
|---|---|---|---|---|
| Observability | Full | Full | Partial | **Partial, first-person, text** |
| Map dynamics | Static | Static | Dynamic (fire) | **Dynamic (fire + doors)** |
| Action richness | 4 moves | 4 moves | Suppression | **Movement + door control + look** |
| Agent role | Mover | Mover | Suppressor | **Survivor** |
| Reward complexity | Reach goal | Reach goal | Suppress fire | **8-component composite rubric** |
*`wildfire_env` trains an agent to fight fires from above; Pyre trains an agent to survive from inside.*
---
## What the agent sees (narrative observation)
At every step, the agent receives a first-person text observation:
```
You are in the **main_corridor**. The air is **moderate**.
Health: ████████░░ (85/100) | Wind: **EAST**
Flames are visible to the **west**.
Exits visible: exit_0_7 at 8m west.
Doors: door_1 (closed) at 2m east.
You hear: Fire alarm sounding; Smoke detector beeping.
Last action: You move south. The smoke is thick here.
Available actions: move(direction='north') move(direction='south') door(target_id='door_1', door_state='open') look(direction='east') wait()
```
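Because the observation is plain text, an agent wrapper has to recover the legal actions from the `Available actions:` line. The sketch below shows one way to do that with regular expressions. It assumes the line keeps the `name(key='value', ...)` layout shown in the sample above; `parse_available_actions` is a hypothetical helper, not part of the `pyre_env` package.

```python
import re

# Matches calls such as move(direction='north') or wait()
ACTION_RE = re.compile(r"(\w+)\(([^)]*)\)")
# Matches keyword arguments such as direction='north'
ARG_RE = re.compile(r"(\w+)='([^']*)'")


def parse_available_actions(narrative: str) -> list:
    """Return one dict per action listed on the 'Available actions:' line."""
    prefix = "Available actions:"
    for line in narrative.splitlines():
        if line.startswith(prefix):
            body = line[len(prefix):]
            return [
                {"action": name, **dict(ARG_RE.findall(args))}
                for name, args in ACTION_RE.findall(body)
            ]
    return []  # no action line found in this observation


example = (
    "Last action: You move south. The smoke is thick here.\n"
    "Available actions: move(direction='north') "
    "door(target_id='door_1', door_state='open') wait()"
)
actions = parse_available_actions(example)
for action in actions:
    print(action)
```

Each parsed dict has the same flat shape as the JSON body in the Quick start `curl` example, so it could be forwarded as-is if the server accepts the same field names for every action.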
---
## Action space
| Action | Parameters | Effect |
|---|---|---|
| `move` | `direction` | Move one cell N/S/E/W |
| `door` | `target_id`, `door_state` | Open or close a nearby door — closed doors slow fire spread to 15% |
| `look` | `direction` | Scan up to 5 cells in one direction for detailed zone/fire/smoke info |
| `wait` | — | Skip turn |
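The table above can be turned into a small client-side check that rejects malformed actions before they reach the server. This is a minimal sketch: `build_action` and `REQUIRED_PARAMS` are hypothetical names, the four lowercase direction strings are assumed from the sample observation, and the flat payload shape is assumed to match the `/step` example in the Quick start.

```python
# Required parameters per action, copied from the action table.
REQUIRED_PARAMS = {
    "move": ("direction",),
    "door": ("target_id", "door_state"),
    "look": ("direction",),
    "wait": (),
}

# Assumed spelling of the four directions (N/S/E/W in the table).
DIRECTIONS = {"north", "south", "east", "west"}


def build_action(action: str, **params: str) -> dict:
    """Validate an action against the table and return a flat payload dict."""
    if action not in REQUIRED_PARAMS:
        raise ValueError(f"unknown action: {action!r}")
    required = set(REQUIRED_PARAMS[action])
    if set(params) != required:
        raise ValueError(f"{action} expects parameters {sorted(required)}")
    if "direction" in params and params["direction"] not in DIRECTIONS:
        raise ValueError(f"unknown direction: {params['direction']!r}")
    return {"action": action, **params}


payload = build_action("move", direction="north")
print(payload)
```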
---
## Reward function (composite rubric)
**Per step:**
- `-0.01` constant time penalty
- `+0.1` moved closer to nearest unblocked exit (BFS distance)
- `-0.5` moved into a cell with smoke at moderate level or worse, or a cell adjacent to fire
- `-0.02 × damage` health drain from smoke/fire exposure
- `+0.5` closed a door adjacent to active fire (strategic)
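The "moved closer" bonus depends on a BFS distance to the nearest unblocked exit. The sketch below illustrates that idea on a toy grid. The character encoding (`#` wall, `F` fire, `E` exit, `.` floor) and the function name `bfs_exit_distance` are assumptions made for illustration; the environment's own map representation may differ.

```python
from collections import deque
from typing import Optional


def bfs_exit_distance(grid: list, start: tuple) -> Optional[int]:
    """Shortest 4-connected path length from start to any exit, or None."""
    rows, cols = len(grid), len(grid[0])
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (r, c), dist = queue.popleft()
        if grid[r][c] == "E":
            return dist
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if (
                0 <= nr < rows
                and 0 <= nc < cols
                and (nr, nc) not in seen
                and grid[nr][nc] not in "#F"  # walls and fire block the path
            ):
                seen.add((nr, nc))
                queue.append(((nr, nc), dist + 1))
    return None  # every exit is cut off


grid = [
    "E..#",
    ".F..",
    "...E",
]
before = bfs_exit_distance(grid, (2, 1))
after = bfs_exit_distance(grid, (2, 2))  # agent stepped east
moved_closer = after is not None and before is not None and after < before
print(before, after, moved_closer)
```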
**Episode end:**
- `+5.0` agent evacuated alive
- `-10.0` agent incapacitated
- `+0.05 × remaining_steps` time bonus for fast evacuation
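As a rough illustration, the listed components can be combined as below. This is a sketch under stated assumptions, not the code in `rubrics.py`: it assumes the components are simply summed, that the time bonus is paid only on a successful evacuation, and that an episode ending by timeout earns no terminal reward. `StepInfo`, `step_reward`, and `terminal_reward` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class StepInfo:
    """Per-step facts the rubric needs (hypothetical container)."""
    moved_closer: bool = False
    entered_hazard: bool = False
    damage: float = 0.0
    closed_door_near_fire: bool = False


def step_reward(info: StepInfo) -> float:
    reward = -0.01                      # constant time penalty
    if info.moved_closer:
        reward += 0.1                   # progress toward nearest exit
    if info.entered_hazard:
        reward -= 0.5                   # smoke or fire-adjacent cell
    reward -= 0.02 * info.damage        # health drain
    if info.closed_door_near_fire:
        reward += 0.5                   # strategic door close
    return reward


def terminal_reward(evacuated: bool, incapacitated: bool,
                    remaining_steps: int) -> float:
    if evacuated:
        # Assumption: the time bonus applies only to successful evacuations.
        return 5.0 + 0.05 * remaining_steps
    if incapacitated:
        return -10.0
    return 0.0  # assumption: timeout without evacuation or incapacitation


print(step_reward(StepInfo(moved_closer=True, damage=5.0)))
print(terminal_reward(evacuated=True, incapacitated=False, remaining_steps=40))
```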
---
## Quick start
```bash
cd pyre_env
uv sync
uv run server  # → http://localhost:8000

# Health check
curl http://localhost:8000/health

# Reset (difficulty: easy | medium | hard)
curl -X POST http://localhost:8000/reset \
  -H "Content-Type: application/json" \
  -d '{"difficulty": "medium"}'

# Step
curl -X POST http://localhost:8000/step \
  -H "Content-Type: application/json" \
  -d '{"action": "move", "direction": "north"}'

# Random baseline (5 episodes)
python examples/random_agent.py --episodes 5 --verbose
```
### Python client
```python
from pyre_env import PyreEnv, PyreAction

# Connect to a running server, reset the episode, then take one step.
with PyreEnv(base_url="http://localhost:8000") as env:
    result = env.reset()
    print(result.observation.narrative)

    result = env.step(PyreAction(action="move", direction="north"))
    print(f"Reward: {result.reward}, Health: {result.observation.agent_health}")
```
---
## Difficulty levels
| Level | Fire sources | Spread rate | Wind | Humidity | Max steps |
|---|---|---|---|---|---|
| `easy` | 1 | 10–20% | Calm | 30–50% | 200 |
| `medium` | 2–4 | 15–40% | Any | 10–45% | 150 |
| `hard` | 3–5 | 30–55% | Never calm | 5–20% | 100 |
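To make the table concrete, the sketch below encodes each level as a preset and samples one episode configuration from it. It is illustrative only: `Preset`, `PRESETS`, and `sample_episode` are hypothetical names, uniform sampling within each range is an assumption, and the wind labels other than `EAST` (which appears in the sample observation) are guesses.

```python
import random
from dataclasses import dataclass

# Assumed wind labels; only "EAST" is confirmed by the sample observation.
WINDS = ("NORTH", "SOUTH", "EAST", "WEST")


@dataclass(frozen=True)
class Preset:
    fire_sources: tuple   # (min, max) number of ignition points
    spread_rate: tuple    # (min, max) spread rate as a fraction
    winds: tuple          # wind values the level may draw from
    humidity: tuple       # (min, max) humidity as a fraction
    max_steps: int


PRESETS = {
    "easy": Preset((1, 1), (0.10, 0.20), ("CALM",), (0.30, 0.50), 200),
    "medium": Preset((2, 4), (0.15, 0.40), ("CALM",) + WINDS, (0.10, 0.45), 150),
    "hard": Preset((3, 5), (0.30, 0.55), WINDS, (0.05, 0.20), 100),
}


def sample_episode(level: str, rng: random.Random) -> dict:
    """Draw one episode configuration from the preset ranges."""
    p = PRESETS[level]
    return {
        "fire_sources": rng.randint(*p.fire_sources),
        "spread_rate": rng.uniform(*p.spread_rate),
        "wind": rng.choice(p.winds),
        "humidity": rng.uniform(*p.humidity),
        "max_steps": p.max_steps,
    }


config = sample_episode("hard", random.Random(0))
print(config)
```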
---
## Deployment
```bash
openenv push --repo-id your-org/pyre-env
```
---
## Project structure
```
pyre_env/
├── models.py PyreAction, PyreObservation, PyreMapState, PyreState
├── client.py PyreEnv (EnvClient subclass)
├── openenv.yaml OpenEnv manifest
├── pyproject.toml
├── server/
│ ├── app.py FastAPI bootstrap
│ ├── pyre_env_environment.py Main Environment class
│ ├── floor_plan.py 3 building templates + episode generation
│ ├── fire_sim.py Cellular automaton fire/smoke simulation
│ ├── narrative.py Visibility + first-person text observation renderer
│ └── rubrics.py 8 composable reward components
└── examples/
└── random_agent.py Smoke-test baseline
```
---
## Hackathon alignment
- **Theme #2 — Long-Horizon Planning**: 50–200 step episodes; agent must build a mental map across many partial observations
- **Theme #3.1 — World Modeling**: no global map; agent infers fire spread, corridor topology, and exit reachability from local text observations alone