---
title: Traffic Signal Optimization — OpenEnv Elite
emoji: 🚦
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
---
# 🚥 Traffic Signal Optimization — OpenEnv Elite

**Meta × PyTorch OpenEnv Hackathon Submission**

A world-class Reinforcement Learning environment for urban traffic control, featuring stochastic multi-lane dynamics, emergency-vehicle prioritization, and sophisticated fairness-driven rewards.
## 🏗️ Problem Statement
Fixed-cycle traffic signals are a relic of the past. In modern urban environments, they create needless congestion, increase CO2 emissions, and — most critically — cause life-threatening delays for emergency vehicles.
This project provides a high-fidelity 4-way intersection simulation designed for OpenEnv. It challenges RL agents to move beyond simple throughput and master the art of dynamic balancing: serving high-demand lanes while maintaining fairness for low-traffic directions and clearing "Golden Windows" for emergency responders.
## 🚀 Quick Start

```bash
# Run the complete suite: simulation + sanity checks + comparison
python test_env.py

# Run a specific high-intensity scenario
python test_env.py hard
```
```python
from env import TrafficEnv
from tasks import get_config
from baseline_agent import RuleBasedAgent

# 1. Load a structured difficulty profile
config = get_config("medium")
env = TrafficEnv(config)

# 2. Initialize the rule-based controller
agent = RuleBasedAgent()
state = env.reset()
done = False

while not done:
    action = agent.select_action(state)
    state, reward, done, info = env.step(action)

print(f"Total Cleared: {info['total_cleared']}")
print(f"Fairness Index: {info['fairness_score']:.2f}")
```
## 🧠 Environment Design Philosophy

### State Space

The environment exposes a 14-dimensional continuous observation vector, providing the agent with full situational awareness:

- **Queues (4):** exact vehicle count per lane `[N, S, E, W]`.
- **Wait Pressure (4):** cumulative "impatience" score per lane.
- **Emergency Flags (4):** binary detection of EVs per lane.
- **Signal State (2):** current phase (`0 = NS`, `1 = EW`) and step count.
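As a sketch, the observation above could be packed like this (the function name and argument layout are illustrative assumptions, not the actual `env.py` API):

```python
import numpy as np

def build_observation(queues, wait_pressure, ev_flags, phase, step_count):
    """Pack the 14 features into one flat vector: 4 queue counts,
    4 wait-pressure scores, 4 emergency flags, phase, and step count."""
    obs = np.concatenate([
        np.asarray(queues, dtype=np.float32),        # [N, S, E, W] counts
        np.asarray(wait_pressure, dtype=np.float32), # cumulative impatience
        np.asarray(ev_flags, dtype=np.float32),      # 0/1 EV detection
        np.array([phase, step_count], dtype=np.float32),
    ])
    assert obs.shape == (14,)
    return obs
```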
### Action Space

- `0`: **Maintain** — keep the current green phase.
- `1`: **Switch** — transition the signal (includes yellow-phase discharge friction).
## 💎 Reward Engineering (The "Judge's Choice")

Our reward function is the core of this submission. It isn't just a count; it's a multi-objective ethical framework clipped to [-1, 1]:

| Component | Logic | Purpose |
|---|---|---|
| Throughput (+) | `+0.20 * cars_cleared` | Incentivizes active vehicle flow. |
| Density (-) | `-0.40 * total_congestion` | Penalizes letting the intersection fill up. |
| Bottleneck (-) | `-0.15 * max_queue` | Discourages extreme build-up in any single lane. |
| Stability (-) | `-switch_penalty` | Prevents "flickering" and promotes signal stability. |
| Fairness (+/-) | `+0.10` bonus / penalty | Rewards balanced service; penalizes starvation. |
| Emergency (🚨) | Golden Window bonus | Massive reward for clearing EVs within the target steps. |
| EV Delay (-) | exponential penalty | Punishes agents for delaying life-saving vehicles. |
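Read as a single scalar, the table combines along these lines (the coefficients come from the table; the clipping and the exact bonus/penalty arguments are illustrative assumptions, not the repo's exact code):

```python
def compute_reward(cleared, congestion, max_queue, switched,
                   fairness_term, ev_bonus, ev_penalty,
                   switch_penalty=0.20):
    """Multi-objective reward, clipped to [-1, 1].

    NOTE: the switch_penalty default and the clip range match the README;
    how the fairness/EV terms are computed is sketched elsewhere.
    """
    r = 0.20 * cleared          # throughput
    r -= 0.40 * congestion      # density
    r -= 0.15 * max_queue       # bottleneck
    if switched:
        r -= switch_penalty     # stability
    r += fairness_term          # balanced service (+/-)
    r += ev_bonus - ev_penalty  # emergency handling
    return max(-1.0, min(1.0, r))
```

For instance, clearing 3 vehicles with no penalties yields `0.20 * 3 = 0.60`, matching the walkthrough below.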
## 📊 Evaluation Metrics

We track 8 key performance indicators per episode so that a submission's quality can be quantified:
- Total Cleared: Raw efficiency metric.
- Avg Waiting Time: The "commuter frustration" index.
- Max Queue Length: Gauges system robustness against bottlenecks.
- Signal Switch Count: Measures policy stability.
- Congestion Score: Final system state snapshot.
- Avg EV Clear Time: Critical safety metric (lower is better).
- Fairness Score: [0, 1] index — how equally did we serve all lanes?
- Total EV Penalty: Measures total failure to prioritize safety.
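One way to compute a [0, 1] fairness index over per-lane service counts is Jain's fairness index; this is a common choice, but the repo's exact formula is an assumption here:

```python
def fairness_score(served_per_lane):
    """Jain's fairness index: 1.0 when all lanes are served equally,
    approaching 1/n when a single lane monopolizes service."""
    total = sum(served_per_lane)
    if total == 0:
        return 1.0  # nothing served yet: treat as trivially fair
    n = len(served_per_lane)
    return total ** 2 / (n * sum(x * x for x in served_per_lane))
```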
## ⚡ Task Difficulty Levels
| Parameter | Easy | Medium | Hard |
|---|---|---|---|
| Arrival Rate | 0–1 | 1–3 | 2–5 |
| Discharge Rate | 4–5 | 3–5 | 2–4 |
| Burst Frequency | 0% | 10% | 20% |
| Emergency Prob | 1% | 5% | 15% |
| EV Golden Window | 8 steps | 5 steps | 3 steps |
| Fairness Limit | 20 steps | 15 steps | 10 steps |
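The table above maps naturally onto a config lookup of the kind `tasks.get_config` provides; the dictionary keys below are assumptions about naming, not the actual `tasks.py` schema:

```python
# Difficulty profiles mirroring the table (key names are illustrative).
DIFFICULTY = {
    "easy":   dict(arrival=(0, 1), discharge=(4, 5), burst=0.00,
                   p_emergency=0.01, golden_window=8, starvation_limit=20),
    "medium": dict(arrival=(1, 3), discharge=(3, 5), burst=0.10,
                   p_emergency=0.05, golden_window=5, starvation_limit=15),
    "hard":   dict(arrival=(2, 5), discharge=(2, 4), burst=0.20,
                   p_emergency=0.15, golden_window=3, starvation_limit=10),
}

def get_config(level):
    """Return the parameter profile for a difficulty level."""
    return DIFFICULTY[level]
```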
## 🚑 Emergency & Fairness Logic

### The "Golden Window"
When an Emergency Vehicle (EV) appears, the agent is granted a bonus if it switches and clears the lane within the Golden Window (defined per difficulty). Failing to do so triggers an exponential delay penalty, simulating the real-world cost of stopping an ambulance or fire truck.
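An exponential delay penalty of this shape is a minimal sketch; the `base` and `growth` constants are assumptions, only the Golden Window cutoff comes from the difficulty table:

```python
def ev_delay_penalty(ev_timer, golden_window, base=0.05, growth=1.5):
    """Zero while the EV is still inside the Golden Window, then grows
    exponentially with each extra step the EV is held at a red light."""
    overdue = ev_timer - golden_window
    if overdue <= 0:
        return 0.0
    return base * growth ** overdue
```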
### Fairness Guard
To prevent "Starvation" (where the agent ignores a low-traffic lane to optimize throughput on a high-traffic lane), a Fairness Score is calculated. If a lane remains red beyond the Starvation Limit, the agent suffers a heavy penalty. This forces the agent to learn the complex trade-off between total throughput and social fairness.
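A starvation guard can be as simple as a flat penalty once any lane's red-time exceeds the limit; the limit comes from the "Fairness Limit" row of the difficulty table, while the `weight` constant is an assumption:

```python
def starvation_penalty(red_steps_per_lane, starvation_limit, weight=0.30):
    """Heavy flat penalty once any lane has sat at red past the limit."""
    return weight if max(red_steps_per_lane) > starvation_limit else 0.0
```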
## 🚶 Step Walkthrough

**Step 12:** 🚨 Ambulance detected in the East lane (currently RED).
- EW Queue: 4, EV Timer: 0
- Agent receives the `p_emergency` penalty.

**Step 13:** Agent action: `1` (SWITCH to EW).
- Switch penalty applied (-0.20).
- NS lanes stop; EW lanes turn GREEN.

**Step 14:** EV cleared!
- EV Clear Time: 2 steps.
- Agent receives `r_ev_bonus` (+0.25) for "Golden Window" clearance.
- Throughput reward for cleared vehicles (+0.60).
## 🔮 Future Improvements
- Multi-Intersection Coordination: Extending to a grid of agents using MARL.
- Pedestrian Logic: Adding crosswalks and pedestrian priority.
- V2X Communication: Providing agents with ahead-of-time traffic predictions.
## 📜 License
MIT © 2026 Meta x PyTorch OpenEnv Hackathon