---
title: Traffic Signal Optimization — OpenEnv Elite
emoji: 🚦
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
---

# 🚥 Traffic Signal Optimization — OpenEnv Elite

**Meta × PyTorch OpenEnv Hackathon Submission**

A world-class Reinforcement Learning environment for urban traffic control, featuring stochastic multi-lane dynamics, emergency vehicle prioritization, and sophisticated fairness-driven rewards.


## 🏗️ Problem Statement

Fixed-cycle traffic signals are a relic of the past. In modern urban environments, they create needless congestion, increase CO2 emissions, and — most critically — cause life-threatening delays for emergency vehicles.

This project provides a high-fidelity 4-way intersection simulation designed for OpenEnv. It challenges RL agents to move beyond simple throughput and master the art of dynamic balancing: serving high-demand lanes while maintaining fairness for low-traffic directions and clearing "Golden Windows" for emergency responders.


## 🚀 Quick Start

```bash
# Run the complete suite: simulation + sanity checks + baseline comparison
python test_env.py

# Run a specific high-intensity scenario
python test_env.py hard
```

```python
from env import TrafficEnv
from tasks import get_config
from baseline_agent import RuleBasedAgent

# 1. Load a structured difficulty profile
config = get_config("medium")
env = TrafficEnv(config)

# 2. Initialize the rule-based baseline controller
agent = RuleBasedAgent()

state = env.reset()
done = False

while not done:
    action = agent.select_action(state)
    state, reward, done, info = env.step(action)

print(f"Total Cleared: {info['total_cleared']}")
print(f"Fairness Index: {info['fairness_score']:.2f}")
```

## 🧠 Environment Design Philosophy

### State Space

The environment exposes a 14-dimensional continuous observation vector, giving the agent full situational awareness:

- **Queues (4):** exact vehicle count per lane `[N, S, E, W]`.
- **Wait Pressure (4):** cumulative "impatience" score per lane.
- **Emergency Flags (4):** binary EV detection per lane.
- **Signal State (2):** current phase (`0 = NS`, `1 = EW`) and step count.
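
As a sketch, the 14-dimensional vector could be assembled as below; the exact ordering and scaling used by `TrafficEnv` are assumptions, only the component groups come from the list above.

```python
import numpy as np

# Hypothetical assembly of the 14-dim observation described above;
# the ordering and dtype are assumptions, not TrafficEnv's exact layout.
def build_observation(queues, wait_pressure, ev_flags, phase, step):
    obs = np.concatenate([
        np.asarray(queues, dtype=np.float32),         # 4: vehicles per lane [N, S, E, W]
        np.asarray(wait_pressure, dtype=np.float32),  # 4: cumulative impatience per lane
        np.asarray(ev_flags, dtype=np.float32),       # 4: 1.0 if an EV is present in the lane
        np.asarray([phase, step], dtype=np.float32),  # 2: current phase (0=NS, 1=EW) + step count
    ])
    assert obs.shape == (14,)
    return obs
```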

### Action Space

- **`0` Maintain:** keep the current green phase.
- **`1` Switch:** transition the signal (includes yellow-phase discharge friction).

## 💎 Reward Engineering (The "Judge's Choice")

Our reward function is the core of this submission. It is not a simple throughput count; it is a multi-objective, fairness-aware framework whose total is clipped to `[-1, 1]`:

| Component | Logic | Purpose |
|---|---|---|
| Throughput (+) | `+0.20 * cars_cleared` | Incentivizes active vehicle flow. |
| Density (−) | `-0.40 * total_congestion` | Penalizes letting the intersection fill up. |
| Bottleneck (−) | `-0.15 * max_queue` | Discourages extreme build-up in any single lane. |
| Stability (−) | `-switch_penalty` | Prevents "flickering" and promotes signal stability. |
| Fairness (±) | `+0.10` bonus / penalty | Rewards balanced service; penalizes starvation. |
| Emergency (🚨) | Golden Window bonus | Large reward for clearing EVs within the target steps. |
| EV Delay (−) | Exponential penalty | Punishes agents for delaying life-saving vehicles. |
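
The table can be sketched as a single function. The throughput, switch, and EV-bonus constants below come from this README; the normalization scales, the `-0.10` starvation value, and the exponential constants are illustrative assumptions, not the environment's exact values.

```python
import math

def compute_reward(cars_cleared, queues, switched, starved,
                   ev_cleared_in_window, ev_delay_steps):
    """Sketch of the multi-objective reward described in the table;
    several constants are assumptions (see comments)."""
    reward = 0.20 * cars_cleared            # Throughput (+)
    reward -= 0.40 * sum(queues) / 40.0     # Density (-): congestion, assumed scale
    reward -= 0.15 * max(queues) / 10.0     # Bottleneck (-): worst lane, assumed scale
    if switched:
        reward -= 0.20                      # Stability (-): value from the walkthrough
    reward += -0.10 if starved else 0.10    # Fairness (+/-): -0.10 is an assumption
    if ev_cleared_in_window:
        reward += 0.25                      # Emergency bonus: value from the walkthrough
    elif ev_delay_steps > 0:                # EV delay (-): exponential, assumed constants
        reward -= min(1.0, 0.05 * math.exp(0.5 * ev_delay_steps))
    return max(-1.0, min(1.0, reward))      # Clip to [-1, 1]
```

Note how the walkthrough episode below reproduces under these constants: three cars cleared with a switch and a Golden Window clearance gives `0.60 - 0.20 + 0.10 + 0.25 = 0.75` before clipping.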

## 📊 Evaluation Metrics

We track eight key performance indicators per episode:

  1. Total Cleared: Raw efficiency metric.
  2. Avg Waiting Time: The "commuter frustration" index.
  3. Max Queue Length: Gauges system robustness against bottlenecks.
  4. Signal Switch Count: Measures policy stability.
  5. Congestion Score: Final system state snapshot.
  6. Avg EV Clear Time: Critical safety metric (lower is better).
  7. Fairness Score: [0, 1] index — how equally did we serve all lanes?
  8. Total EV Penalty: Measures total failure to prioritize safety.

## ⚡ Task Difficulty Levels

| Parameter | Easy | Medium | Hard |
|---|---|---|---|
| Arrival Rate | 0–1 | 1–3 | 2–5 |
| Discharge Rate | 4–5 | 3–5 | 2–4 |
| Burst Frequency | 0% | 10% | 20% |
| Emergency Prob | 1% | 5% | 15% |
| EV Golden Window | 8 steps | 5 steps | 3 steps |
| Fairness Limit | 20 steps | 15 steps | 10 steps |
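
As a sketch, `get_config` presumably returns a profile shaped like the following; the field names are assumptions inferred from the table, not the actual `tasks.py` schema.

```python
from dataclasses import dataclass

# Hypothetical shape of the object returned by tasks.get_config();
# field names are assumptions inferred from the difficulty table above.
@dataclass
class TrafficConfig:
    arrival_rate: tuple      # vehicles arriving per lane per step (min, max)
    discharge_rate: tuple    # vehicles cleared per green lane per step (min, max)
    burst_frequency: float   # probability of a sudden arrival burst
    emergency_prob: float    # per-step probability an EV appears
    ev_golden_window: int    # steps allowed to clear an EV for the bonus
    fairness_limit: int      # max consecutive red steps before starvation penalty

MEDIUM = TrafficConfig(
    arrival_rate=(1, 3),
    discharge_rate=(3, 5),
    burst_frequency=0.10,
    emergency_prob=0.05,
    ev_golden_window=5,
    fairness_limit=15,
)
```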

## 🚑 Emergency & Fairness Logic

### The "Golden Window"

When an Emergency Vehicle (EV) appears, the agent is granted a bonus if it switches and clears the lane within the Golden Window (defined per difficulty). Failing to do so triggers an exponential delay penalty, simulating the real-world cost of stopping an ambulance or fire truck.
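
A minimal sketch of such a penalty curve, assuming illustrative `base` and `growth` constants (the environment's exact values are not stated here): zero inside the Golden Window, then growing exponentially with each extra step of delay.

```python
import math

def ev_delay_penalty(delay_steps, golden_window, base=0.05, growth=0.6):
    """Illustrative exponential EV delay penalty: free inside the Golden
    Window, exponential thereafter. `base` and `growth` are assumptions."""
    overtime = max(0, delay_steps - golden_window)
    if overtime == 0:
        return 0.0
    return min(1.0, base * math.exp(growth * overtime))  # capped for reward clipping
```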

### Fairness Guard

To prevent "Starvation" (where the agent ignores a low-traffic lane to optimize throughput on a high-traffic lane), a Fairness Score is calculated. If a lane remains red beyond the Starvation Limit, the agent suffers a heavy penalty. This forces the agent to learn the complex trade-off between total throughput and social fairness.
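
The README does not give the fairness formula, but one plausible `[0, 1]` index over per-lane service counts is Jain's fairness index, sketched here as an assumption: it equals 1 when all lanes are served equally and 1/n when a single lane gets everything.

```python
def fairness_score(served_counts):
    """Jain's fairness index over vehicles served per lane -- one plausible
    [0, 1] fairness measure; the environment's exact formula is an assumption."""
    n = len(served_counts)
    total = sum(served_counts)
    if total == 0:
        return 1.0  # nothing served anywhere is treated as vacuously fair
    return total ** 2 / (n * sum(c * c for c in served_counts))
```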


## 🚶 Step Walkthrough

```text
Step 12:  🚨 Ambulance detected in East lane (currently RED).
          - EW Queue: 4, EV Timer: 0
          - Agent receives p_emergency penalty.

Step 13:  Agent Action: 1 (SWITCH to EW).
          - Switch penalty applied (-0.20).
          - NS lanes stop; EW lanes turn GREEN.

Step 14:  EV Cleared!
          - EV Clear Time: 2 steps.
          - Agent receives r_ev_bonus (+0.25) for "Golden Window" clearance.
          - Throughput reward for cleared vehicles (+0.60).
```

## 🔮 Future Improvements

- **Multi-Intersection Coordination:** extending to a grid of agents using MARL.
- **Pedestrian Logic:** adding crosswalks and pedestrian priority.
- **V2X Communication:** providing agents with ahead-of-time traffic predictions.

## 📜 License

MIT © 2026 Meta x PyTorch OpenEnv Hackathon