---
title: sumo_rl_env Environment
sdk: docker
app_port: 8000
base_path: /web
tags:
  - openenv
  - openenv-0.2.3
---

# sumo_rl_env Environment

Space URL: `https://huggingface.co/spaces/openenv/sumo_rl_env`

OpenEnv pinned ref: `0.2.3`

# SUMO-RL Environment

Integration of traffic signal control with the OpenEnv framework via SUMO (Simulation of Urban MObility) and SUMO-RL.

## Overview

This environment enables reinforcement learning for **traffic signal control** using SUMO, a microscopic traffic simulation package. Train RL agents to optimize traffic light timing and minimize vehicle delays.

**Key Features**:
- **Realistic traffic simulation** via SUMO
- **Single-agent mode** for single intersection control
- **Configurable rewards** (waiting time, queue, pressure, speed)
- **Multiple networks** supported (custom .net.xml and .rou.xml files)
- **Docker-ready** with pre-bundled example network

## Quick Start

### Using Docker (Recommended)

```python
from envs.sumo_rl_env import SumoRLEnv, SumoAction

# Automatically starts container
env = SumoRLEnv.from_docker_image("sumo-rl-env:latest")

# Reset environment
result = env.reset()
print(f"Observation shape: {result.observation.observation_shape}")
print(f"Available actions: {result.observation.action_mask}")

# Take action (select next green phase)
result = env.step(SumoAction(phase_id=1))
print(f"Reward: {result.reward}, Done: {result.done}")

# Get state
state = env.state()
print(f"Simulation time: {state.sim_time}")
print(f"Total vehicles: {state.total_vehicles}")
print(f"Mean waiting time: {state.mean_waiting_time}")

# Cleanup
env.close()
```

### Building the Docker Image

```bash
cd OpenEnv

# Build base image first (if not already built)
docker build -t envtorch-base:latest -f src/openenv/core/containers/images/Dockerfile .

# Build SUMO-RL environment
docker build -f envs/sumo_rl_env/server/Dockerfile -t sumo-rl-env:latest .
```

### Running with Different Configurations

```bash
# Default: single-intersection
docker run -p 8000:8000 sumo-rl-env:latest

# Longer simulation
docker run -p 8000:8000 \
  -e SUMO_NUM_SECONDS=50000 \
  sumo-rl-env:latest

# Different reward function
docker run -p 8000:8000 \
  -e SUMO_REWARD_FN=queue \
  sumo-rl-env:latest

# Custom seed for reproducibility
docker run -p 8000:8000 \
  -e SUMO_SEED=123 \
  sumo-rl-env:latest
```

## Observation

The observation is a vector containing:
- **Phase one-hot**: Current active green phase (one-hot encoded)
- **Min green flag**: Binary indicator of whether the minimum green time has elapsed
- **Lane densities**: Number of vehicles / lane capacity for each incoming lane
- **Lane queues**: Number of queued vehicles / lane capacity for each incoming lane

Observation size varies by network topology (depends on number of phases and lanes).

**Default (single-intersection)**:
- 4 green phases
- 8 incoming lanes
- Observation size: 21 elements (4 phase one-hot + 1 min green flag + 8 densities + 8 queues)
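
As a rough illustration of the layout described above, the sketch below slices a flat observation vector into its four parts. It assumes the components appear in the order listed (phase one-hot, min green flag, densities, queues). `split_observation` and its parameter names are hypothetical helpers, not part of the environment's API.

```python
from typing import Dict, List


def split_observation(
    obs: List[float], num_green_phases: int, num_lanes: int
) -> Dict[str, List[float]]:
    """Slice a flat observation into named parts (assumed ordering)."""
    expected = num_green_phases + 1 + 2 * num_lanes
    if len(obs) != expected:
        raise ValueError(f"expected {expected} elements, got {len(obs)}")
    p, n = num_green_phases, num_lanes
    return {
        "phase_one_hot": obs[:p],
        "min_green": obs[p:p + 1],
        "densities": obs[p + 1:p + 1 + n],
        "queues": obs[p + 1 + n:],
    }


# Default single-intersection layout: 4 phases, 8 lanes -> 4 + 1 + 8 + 8 = 21.
# The values below are made up for illustration.
example = [0.0, 1.0, 0.0, 0.0] + [1.0] + [0.25] * 8 + [0.1] * 8
parts = split_observation(example, num_green_phases=4, num_lanes=8)
print(parts["phase_one_hot"].index(1.0))  # index of the active green phase
```

For other networks, the phase and lane counts would have to be taken from the network definition, since the observation size depends on both.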

## Action Space

The action space is discrete and represents selecting the next green phase to activate.

- **Action type**: Discrete
- **Action range**: `[0, num_green_phases - 1]`
- **Default (single-intersection)**: 4 actions (one per green phase)

When a phase change is requested, a yellow phase (lasting `SUMO_YELLOW_TIME` seconds) is inserted automatically before the switch.
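
A small client-side guard like the following can catch out-of-range phase indices before a request is sent. `validate_phase_id` is a hypothetical helper, and the number of green phases is assumed to be known to the caller (4 for the bundled network).

```python
def validate_phase_id(phase_id: int, num_green_phases: int) -> int:
    """Return phase_id unchanged if it lies in [0, num_green_phases - 1]."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if not isinstance(phase_id, int) or isinstance(phase_id, bool):
        raise TypeError("phase_id must be an int")
    if not 0 <= phase_id < num_green_phases:
        raise ValueError(
            f"phase_id {phase_id} outside [0, {num_green_phases - 1}]"
        )
    return phase_id


print(validate_phase_id(3, num_green_phases=4))
```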

## Rewards

Default reward function is **change in cumulative waiting time**:
```
reward = -(total_waiting_time_now - total_waiting_time_previous)
```

Positive rewards indicate waiting time decreased (good).
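
To make the sign convention concrete, here is a small worked example of the formula above applied to a sequence of waiting-time totals. The numbers are invented for illustration, and `diff_waiting_time_rewards` is a hypothetical helper rather than the function used inside the environment.

```python
from typing import List


def diff_waiting_time_rewards(total_waiting: List[float]) -> List[float]:
    """Reward per step: previous - now, i.e. -(now - previous)."""
    return [
        prev - now
        for prev, now in zip(total_waiting, total_waiting[1:])
    ]


# Hypothetical total waiting times (seconds) at four consecutive decisions.
waits = [120.0, 150.0, 140.0, 140.0]
print(diff_waiting_time_rewards(waits))  # [-30.0, 10.0, 0.0]
```

Waiting time rose by 30 s in the first step (negative reward), fell by 10 s in the second (positive reward), and stayed flat in the third (zero reward).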

### Available Reward Functions

Set via `SUMO_REWARD_FN` environment variable:

- **`diff-waiting-time`** (default): Change in cumulative waiting time
- **`average-speed`**: Average speed of all vehicles
- **`queue`**: Negative total queue length
- **`pressure`**: Pressure metric (incoming - outgoing vehicles)

## Configuration

### Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `SUMO_NET_FILE` | `/app/nets/single-intersection.net.xml` | Network topology file |
| `SUMO_ROUTE_FILE` | `/app/nets/single-intersection.rou.xml` | Vehicle routes file |
| `SUMO_NUM_SECONDS` | `20000` | Simulation duration (seconds) |
| `SUMO_DELTA_TIME` | `5` | Seconds between agent actions |
| `SUMO_YELLOW_TIME` | `2` | Yellow phase duration (seconds) |
| `SUMO_MIN_GREEN` | `5` | Minimum green time (seconds) |
| `SUMO_MAX_GREEN` | `50` | Maximum green time (seconds) |
| `SUMO_REWARD_FN` | `diff-waiting-time` | Reward function name |
| `SUMO_SEED` | `42` | Random seed (use for reproducibility) |
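
The sketch below shows one way these variables and their defaults could be read into a typed configuration object. It is an illustration only: `SumoConfig` and `load_config` are hypothetical names, and the server's actual parsing code is not shown in this document.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class SumoConfig:
    """Hypothetical container mirroring the documented SUMO_* variables."""
    net_file: str
    route_file: str
    num_seconds: int
    delta_time: int
    yellow_time: int
    min_green: int
    max_green: int
    reward_fn: str
    seed: int


def load_config(env: Mapping[str, str] = os.environ) -> SumoConfig:
    """Read SUMO_* variables, falling back to the defaults in the table."""
    cfg = SumoConfig(
        net_file=env.get("SUMO_NET_FILE", "/app/nets/single-intersection.net.xml"),
        route_file=env.get("SUMO_ROUTE_FILE", "/app/nets/single-intersection.rou.xml"),
        num_seconds=int(env.get("SUMO_NUM_SECONDS", "20000")),
        delta_time=int(env.get("SUMO_DELTA_TIME", "5")),
        yellow_time=int(env.get("SUMO_YELLOW_TIME", "2")),
        min_green=int(env.get("SUMO_MIN_GREEN", "5")),
        max_green=int(env.get("SUMO_MAX_GREEN", "50")),
        reward_fn=env.get("SUMO_REWARD_FN", "diff-waiting-time"),
        seed=int(env.get("SUMO_SEED", "42")),
    )
    # Sanity check (an assumption, not a documented rule): green bounds ordered.
    if cfg.min_green > cfg.max_green:
        raise ValueError("SUMO_MIN_GREEN must not exceed SUMO_MAX_GREEN")
    return cfg


# Overrides are passed as a plain mapping here instead of the real environment.
print(load_config({"SUMO_REWARD_FN": "queue", "SUMO_SEED": "123"}))
```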

### Using Custom Networks

To use your own SUMO network:

```python
from envs.sumo_rl_env import SumoRLEnv

env = SumoRLEnv.from_docker_image(
    "sumo-rl-env:latest",
    volumes={
        "/path/to/your/nets": {"bind": "/nets", "mode": "ro"}
    },
    environment={
        "SUMO_NET_FILE": "/nets/my-network.net.xml",
        "SUMO_ROUTE_FILE": "/nets/my-routes.rou.xml",
    }
)
```

Your network directory should contain:
- `.net.xml` - Network topology (roads, junctions, traffic lights)
- `.rou.xml` - Vehicle routes (trip definitions, flow rates)
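
Before mounting a directory, it may help to confirm that both file types are present. The following is a minimal sketch using only the standard library. `find_network_files` is a hypothetical helper, and it checks file names only, not whether the XML is valid for SUMO.

```python
import tempfile
from pathlib import Path
from typing import Dict, List


def find_network_files(directory: str) -> Dict[str, List[str]]:
    """List the .net.xml and .rou.xml file names found in a directory."""
    root = Path(directory)
    found = {
        "net": sorted(p.name for p in root.glob("*.net.xml")),
        "rou": sorted(p.name for p in root.glob("*.rou.xml")),
    }
    missing = [kind for kind, names in found.items() if not names]
    if missing:
        raise FileNotFoundError(f"{root} has no *.xml file of type(s): {missing}")
    return found


if __name__ == "__main__":
    # Demo on a throwaway directory with empty placeholder files.
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("my-network.net.xml", "my-routes.rou.xml"):
            (Path(tmp) / name).touch()
        print(find_network_files(tmp))
```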

## API Reference

### SumoAction

```python
@dataclass
class SumoAction(Action):
    phase_id: int  # Green phase to activate (0 to num_phases-1)
    ts_id: str = "0"  # Traffic signal ID (for multi-agent)
```

### SumoObservation

```python
@dataclass
class SumoObservation(Observation):
    observation: List[float]  # Observation vector
    observation_shape: List[int]  # Shape for reshaping
    action_mask: List[int]  # Valid action indices
    sim_time: float  # Current simulation time
    done: bool  # Episode finished
    reward: Optional[float]  # Reward from last action
    metadata: Dict  # System metrics
```

### SumoState

```python
@dataclass
class SumoState(State):
    episode_id: str  # Unique episode ID
    step_count: int  # Steps taken
    net_file: str  # Network file path
    route_file: str  # Route file path
    sim_time: float  # Current simulation time
    total_vehicles: int  # Total vehicles in simulation
    total_waiting_time: float  # Cumulative waiting time
    mean_waiting_time: float  # Mean waiting time
    mean_speed: float  # Mean vehicle speed
    # ... configuration parameters
```

## Example Training Loop

```python
from envs.sumo_rl_env import SumoRLEnv, SumoAction
import numpy as np

# Start environment
env = SumoRLEnv.from_docker_image("sumo-rl-env:latest")

# Training loop
for episode in range(10):
    result = env.reset()
    episode_reward = 0
    steps = 0

    while not result.done and steps < 1000:
        # Random policy (replace with your RL agent)
        action_id = np.random.choice(result.observation.action_mask)

        # Take action
        result = env.step(SumoAction(phase_id=int(action_id)))

        episode_reward += result.reward or 0
        steps += 1

        # Print progress every 100 steps
        if steps % 100 == 0:
            state = env.state()
            # result.reward is Optional, so fall back to 0 before formatting
            print(f"Step {steps}: "
                  f"reward={(result.reward or 0):.2f}, "
                  f"vehicles={state.total_vehicles}, "
                  f"waiting={state.mean_waiting_time:.2f}")

    print(f"Episode {episode}: total_reward={episode_reward:.2f}, steps={steps}")

env.close()
```
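
As one possible stand-in for the random policy in the loop above, here is a minimal tabular Q-learning sketch. It assumes the observation is a flat list of floats roughly in [0, 1] and that `action_mask` lists valid action indices, as described in the API Reference. `TabularQAgent`, its discretization scheme, and its hyperparameters are illustrative assumptions, not part of SUMO-RL or OpenEnv.

```python
import random
from collections import defaultdict
from typing import Dict, Sequence, Tuple

StateKey = Tuple[int, ...]


class TabularQAgent:
    """Epsilon-greedy tabular Q-learning over a discretized observation."""

    def __init__(self, alpha: float = 0.1, gamma: float = 0.99,
                 epsilon: float = 0.05, bins: int = 10, seed: int = 0) -> None:
        self.alpha, self.gamma = alpha, gamma
        self.epsilon, self.bins = epsilon, bins
        self.rng = random.Random(seed)
        self.q: Dict[StateKey, Dict[int, float]] = defaultdict(dict)

    def encode(self, obs: Sequence[float]) -> StateKey:
        # Clip each feature to [0, 1], then bucket it into bins + 1 levels.
        return tuple(int(min(max(x, 0.0), 1.0) * self.bins) for x in obs)

    def act(self, obs: Sequence[float], valid_actions: Sequence[int]) -> int:
        # Explore with probability epsilon, otherwise pick the best known action.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(valid_actions))
        values = self.q[self.encode(obs)]
        return max(valid_actions, key=lambda a: values.get(a, 0.0))

    def learn(self, obs: Sequence[float], action: int, reward: float,
              next_obs: Sequence[float], next_valid: Sequence[int],
              done: bool) -> None:
        # Standard one-step Q-learning update.
        values = self.q[self.encode(obs)]
        next_values = self.q[self.encode(next_obs)]
        best_next = 0.0 if done else max(
            (next_values.get(a, 0.0) for a in next_valid), default=0.0)
        old = values.get(action, 0.0)
        values[action] = old + self.alpha * (reward + self.gamma * best_next - old)


# Tiny offline demo with a made-up 21-element observation.
agent = TabularQAgent(epsilon=0.0)
obs = [0.0] * 21
agent.learn(obs, action=2, reward=1.0, next_obs=obs,
            next_valid=[0, 1, 2, 3], done=True)
print(agent.act(obs, [0, 1, 2, 3]))  # 2
```

In the training loop, `agent.act(result.observation.observation, result.observation.action_mask)` would replace the `np.random.choice` call, with `agent.learn(...)` called after each `env.step`. A table over a 21-dimensional discretized observation grows quickly, so this is meant as a starting point rather than a tuned agent.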

## Performance Notes

### Simulation Speed

- **Reset time**: 1-5 seconds (starts new SUMO simulation)
- **Step time**: ~50-200ms per step (depends on network size)
- **Episode duration**: Minutes (20,000 sim seconds with delta_time=5 → ~4,000 steps)
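
The arithmetic behind these figures can be sketched as follows. `episode_estimate` is a hypothetical helper; it ignores reset time and uses the per-step latency range quoted above as its only input.

```python
from typing import Tuple


def episode_estimate(num_seconds: int, delta_time: int,
                     step_latency_ms: int) -> Tuple[int, float]:
    """Return (agent steps per episode, estimated wall-clock seconds)."""
    steps = num_seconds // delta_time
    return steps, steps * step_latency_ms / 1000


steps, fast_s = episode_estimate(20_000, 5, step_latency_ms=50)
_, slow_s = episode_estimate(20_000, 5, step_latency_ms=200)
print(steps, fast_s, slow_s)  # 4000 200.0 800.0
```

With the defaults, that works out to roughly 3 to 13 minutes of wall-clock time per full episode.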

### Optimization

For faster simulation:
1. Reduce `SUMO_NUM_SECONDS` for shorter episodes
2. Increase `SUMO_DELTA_TIME` for fewer decisions
3. Use simpler networks with fewer vehicles

## Architecture

```
┌─────────────────────────────────┐
│ Client: SumoRLEnv               │
│  .step(phase_id=1)              │
└──────────────┬──────────────────┘
               │ HTTP
┌──────────────▼──────────────────┐
│ FastAPI Server (Docker)         │
│   SumoEnvironment               │
│     ├─ Wraps sumo_rl            │
│     ├─ Single-agent mode        │
│     └─ No GUI                   │
└──────────────┬──────────────────┘
               │
┌──────────────▼──────────────────┐
│ SUMO Simulator                  │
│  - Reads .net.xml (network)     │
│  - Reads .rou.xml (routes)      │
│  - Simulates traffic flow       │
│  - Provides observations        │
└─────────────────────────────────┘
```

## Bundled Network

The default `single-intersection` network is a simple 4-way intersection with:
- **4 incoming roads** (North, South, East, West)
- **4 green phases** (NS straight, NS left, EW straight, EW left)
- **Vehicle flow**: Continuous stream with varying rates

## Limitations

- **No GUI in Docker**: SUMO GUI requires X server (not available in containers)
- **Single-agent only**: Multi-agent control (multiple intersections) is planned for a future version
- **Fixed network per container**: Each container uses one network topology
- **Memory usage**: ~500MB for small networks, 2-4GB for large city networks

## Troubleshooting

### Container won't start
```bash
# Check logs
docker logs <container-id>

# Verify network files exist
docker run sumo-rl-env:latest ls -la /app/nets/
```

### "SUMO_HOME not set" error
This should be automatic in Docker. If running locally:
```bash
export SUMO_HOME=/usr/share/sumo
```

### Slow performance
- Reduce simulation duration: `SUMO_NUM_SECONDS=5000`
- Increase action interval: `SUMO_DELTA_TIME=10`
- Use smaller networks with fewer vehicles

## References

- [SUMO Documentation](https://sumo.dlr.de/docs/)
- [SUMO-RL GitHub](https://github.com/LucasAlegre/sumo-rl)
- [SUMO-RL Paper](https://peerj.com/articles/cs-575/)
- [RESCO Benchmarks](https://github.com/jault/RESCO)

## Citation

If you use SUMO-RL in your research, please cite:

```bibtex
@misc{sumorl,
    author = {Lucas N. Alegre},
    title = {{SUMO-RL}},
    year = {2019},
    publisher = {GitHub},
    journal = {GitHub repository},
    howpublished = {\url{https://github.com/LucasAlegre/sumo-rl}},
}
```

## License

This integration is licensed under a BSD-style license. SUMO-RL and SUMO have their own licenses.