---
title: Traffic Signal Optimization — OpenEnv Elite
emoji: 🚦
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
---
# 🚥 Traffic Signal Optimization — OpenEnv Elite
> **Meta × PyTorch OpenEnv Hackathon Submission**
>
> A world-class Reinforcement Learning environment for urban traffic control, featuring stochastic multi-lane dynamics, emergency vehicle prioritization, and sophisticated fairness-driven rewards.
---
## 🏗️ Problem Statement
Fixed-cycle traffic signals are a relic of the past. In modern urban environments, they create **needless congestion**, increase **CO2 emissions**, and — most critically — cause **life-threatening delays** for emergency vehicles.
This project provides a high-fidelity 4-way intersection simulation designed for OpenEnv. It challenges RL agents to move beyond simple throughput and master the art of **dynamic balancing**: serving high-demand lanes while maintaining fairness for low-traffic directions and clearing "Golden Windows" for emergency responders.
---
## 🚀 Quick Start
```bash
# Run the complete suite: Simulation + Sanity Checks + Comparison
python test_env.py
# Run a specific high-intensity scenario
python test_env.py hard
```
```python
from env import TrafficEnv
from tasks import get_config
from baseline_agent import RuleBasedAgent
# 1. Load a structured difficulty profile
config = get_config("medium")
env = TrafficEnv(config)
# 2. Initialize our sophisticated Rule-Based Controller
agent = RuleBasedAgent()
state = env.reset()
done = False
while not done:
    action = agent.select_action(state)
    state, reward, done, info = env.step(action)
print(f"Total Cleared: {info['total_cleared']}")
print(f"Fairness Index: {info['fairness_score']:.2f}")
```
---
## 🧠 Environment Design Philosophy
### State Space
The environment exposes a **14-dimensional** continuous observation vector, providing the agent with full situational awareness:
- **Queues (4)**: Exact vehicle count per lane [N, S, E, W].
- **Wait Pressure (4)**: Cumulative "impatience" score per lane.
- **Emergency Flags (4)**: Binary detection of EVs per lane.
- **Signal State (2)**: Current phase [0=NS, 1=EW] and step count.
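The assembly of this vector can be sketched as follows. This is a minimal illustration of the layout described above; the function name and field order are assumptions, not the environment's actual internals.

```python
import numpy as np

def build_observation(queues, wait_pressure, ev_flags, phase, step_count):
    """Assemble the 14-dim observation: 4 queue counts, 4 wait-pressure
    scores, 4 emergency flags, plus current phase and step count."""
    obs = np.concatenate([
        np.asarray(queues, dtype=np.float32),         # [N, S, E, W] vehicle counts
        np.asarray(wait_pressure, dtype=np.float32),  # cumulative impatience per lane
        np.asarray(ev_flags, dtype=np.float32),       # binary EV detection per lane
        np.array([phase, step_count], dtype=np.float32),  # 0=NS / 1=EW, and time
    ])
    assert obs.shape == (14,)
    return obs
```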
### Action Space
- `0`: **Maintain** — keep the current green phase.
- `1`: **Switch** — transition the signal (includes yellow-phase discharge friction).
---
## 💎 Reward Engineering (The "Judge's Choice")
Our reward function is the core of this submission. It isn't a simple throughput counter; it's a **multi-objective ethical framework** clipped to `[-1, 1]`:
| Component | Logic | Purpose |
| :--- | :--- | :--- |
| **Throughput (+)** | `+0.20 * cars_cleared` | Incentivizes active vehicle flow. |
| **Density (-)** | `-0.40 * total_congestion` | Penalizes letting the intersection fill up. |
| **Bottleneck (-)** | `-0.15 * max_queue` | Discourages extreme build-up in any single lane. |
| **Stability (-)** | `-switch_penalty` | Prevents "flickering" and promotes signal stability. |
| **Fairness (+/-)** | `+0.10` bonus / `-penalty` | Rewards balanced service; penalizes starvation. |
| **Emergency (🚨)** | `Golden Window` Bonus | Massive reward for clearing EVs within target steps. |
| **EV Delay (-)** | `Exponential Penalty` | Punishes agents for delaying life-saving vehicles. |
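Combining these terms might look like the sketch below. The throughput, density, bottleneck, and fairness coefficients come from the table; the normalization denominators, the exponential growth rate, and the penalty cap are illustrative assumptions.

```python
import math

def compute_reward(cars_cleared, queues, switched,
                   ev_wait=None, golden_window=5, fair=True):
    """Illustrative multi-objective reward, clipped to [-1, 1].
    Normalization constants are assumptions, not the env's actual values."""
    r = 0.20 * cars_cleared                # throughput bonus
    r -= 0.40 * (sum(queues) / 20.0)       # total congestion (normalized, assumed)
    r -= 0.15 * (max(queues) / 10.0)       # worst-lane bottleneck (normalized, assumed)
    if switched:
        r -= 0.20                          # stability: discourage flickering
    r += 0.10 if fair else -0.10           # fairness bonus / starvation penalty
    if ev_wait is not None:
        if ev_wait <= golden_window:
            r += 0.25                      # Golden Window clearance bonus
        else:
            # exponential delay penalty, capped (growth rate is an assumption)
            r -= min(0.5, 0.05 * math.exp(0.3 * (ev_wait - golden_window)))
    return max(-1.0, min(1.0, r))          # clip to [-1, 1]
```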
---
## 📊 Evaluation Metrics
We track **8 key performance indicators** per episode to ensure a winning submission can be quantified:
1. **Total Cleared**: Raw efficiency metric.
2. **Avg Waiting Time**: The "commuter frustration" index.
3. **Max Queue Length**: Gauges system robustness against bottlenecks.
4. **Signal Switch Count**: Measures policy stability.
5. **Congestion Score**: Final system state snapshot.
6. **Avg EV Clear Time**: Critical safety metric (lower is better).
7. **Fairness Score**: [0, 1] index — how equally did we serve all lanes?
8. **Total EV Penalty**: Measures total failure to prioritize safety.
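One standard way to compute a `[0, 1]` fairness index over per-lane service counts is Jain's fairness index, sketched below; whether the environment uses this exact formula is an assumption.

```python
def jain_fairness(served_counts):
    """Jain's fairness index: 1.0 = perfectly equal service across lanes,
    approaching 1/n when a single lane dominates."""
    n = len(served_counts)
    total = sum(served_counts)
    if total == 0:
        return 1.0  # no demand at all: trivially fair (assumption)
    return total * total / (n * sum(c * c for c in served_counts))
```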
---
## ⚡ Task Difficulty Levels
| Parameter | Easy | Medium | Hard |
| :--- | :--- | :--- | :--- |
| **Arrival Rate** | 0–1 | 1–3 | 2–5 |
| **Discharge Rate** | 4–5 | 3–5 | 2–4 |
| **Burst Frequency** | 0% | 10% | 20% |
| **Emergency Prob** | 1% | 5% | 15% |
| **EV Golden Window** | 8 steps | 5 steps | 3 steps |
| **Fairness Limit** | 20 steps | 15 steps | 10 steps |
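A config loader for these profiles could be as simple as the sketch below. The values mirror the table; the `TrafficConfig` field names are hypothetical and may not match the real `tasks.get_config` implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TrafficConfig:
    arrival_rate: tuple      # (min, max) vehicles arriving per lane per step
    discharge_rate: tuple    # (min, max) vehicles cleared per green step
    burst_prob: float        # chance of a sudden demand burst
    emergency_prob: float    # chance an EV spawns in a lane
    golden_window: int       # steps allowed to clear an EV for the bonus
    starvation_limit: int    # max red steps before the fairness penalty

_PROFILES = {
    "easy":   TrafficConfig((0, 1), (4, 5), 0.00, 0.01, 8, 20),
    "medium": TrafficConfig((1, 3), (3, 5), 0.10, 0.05, 5, 15),
    "hard":   TrafficConfig((2, 5), (2, 4), 0.20, 0.15, 3, 10),
}

def get_config(name: str) -> TrafficConfig:
    return _PROFILES[name]
```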
---
## 🚑 Emergency & Fairness Logic
### The "Golden Window"
When an Emergency Vehicle (EV) appears, the agent is granted a bonus if it switches and clears the lane within the **Golden Window** (defined per difficulty). Failing to do so triggers an **exponential delay penalty**, simulating the real-world cost of stopping an ambulance or fire truck.
### Fairness Guard
To prevent "Starvation" (where the agent ignores a low-traffic lane to optimize throughput on a high-traffic lane), a **Fairness Score** is calculated. If a lane remains red beyond the **Starvation Limit**, the agent suffers a heavy penalty. This forces the agent to learn the complex trade-off between total throughput and social fairness.
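The starvation guard can be sketched as a per-lane red-timer update. The linear growth rate of the penalty and the `[N, S, E, W]` timer layout are assumptions for illustration.

```python
def update_fairness(red_timers, green_phase, starvation_limit):
    """Advance per-lane red timers [N, S, E, W] and return the total
    starvation penalty for lanes held red past the limit."""
    ns_green = (green_phase == 0)  # phase 0 = NS green, 1 = EW green
    penalty = 0.0
    for i in range(4):
        lane_is_green = ns_green if i < 2 else not ns_green
        if lane_is_green:
            red_timers[i] = 0            # served: reset the starvation clock
        else:
            red_timers[i] += 1
            if red_timers[i] > starvation_limit:
                # penalty grows with the overshoot (rate is an assumption)
                penalty += 0.10 * (red_timers[i] - starvation_limit)
    return penalty
```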
---
## 🚶 Step Walkthrough
```text
Step 12: 🚨 Ambulance detected in East lane (currently RED).
- EW Queue: 4, EV Timer: 0
- Agent receives p_emergency penalty.
Step 13: Agent Action: 1 (SWITCH to EW).
- Switch penalty applied (-0.20).
- NS lanes stop; EW lanes turn GREEN.
Step 14: EV Cleared!
- EV Clear Time: 2 steps.
- Agent receives r_ev_bonus (+0.25) for "Golden Window" clearance.
- Total cleared (+0.60 reward).
```
---
## 🔮 Future Improvements
- **Multi-Intersection Coordination**: Extending to a grid of agents using MARL.
- **Pedestrian Logic**: Adding crosswalks and pedestrian priority.
- **V2X Communication**: Providing agents with ahead-of-time traffic predictions.
---
## 📜 License
MIT © 2026 Meta x PyTorch OpenEnv Hackathon