---
title: Traffic Signal Optimization — OpenEnv Elite
emoji: 🚦
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
---

# 🚥 Traffic Signal Optimization — OpenEnv Elite

> **Meta × PyTorch OpenEnv Hackathon Submission**
>
> A world-class Reinforcement Learning environment for urban traffic control, featuring stochastic multi-lane dynamics, emergency vehicle prioritization, and sophisticated fairness-driven rewards.

---

## 🏗️ Problem Statement

Fixed-cycle traffic signals are a relic of the past. In modern urban environments, they create **needless congestion**, increase **CO2 emissions**, and — most critically — cause **life-threatening delays** for emergency vehicles.

This project provides a high-fidelity 4-way intersection simulation designed for OpenEnv. It challenges RL agents to move beyond simple throughput and master the art of **dynamic balancing**: serving high-demand lanes while maintaining fairness for low-traffic directions and clearing "Golden Windows" for emergency responders.

---

## 🚀 Quick Start

```bash
# Run the complete suite: Simulation + Sanity Checks + Comparison
python test_env.py

# Run a specific high-intensity scenario
python test_env.py hard
```

```python
from env import TrafficEnv
from tasks import get_config
from baseline_agent import RuleBasedAgent

# 1. Load a structured difficulty profile
config = get_config("medium")
env = TrafficEnv(config)

# 2. Initialize the rule-based baseline controller
agent = RuleBasedAgent()
state = env.reset()

done = False
while not done:
    action = agent.select_action(state)
    state, reward, done, info = env.step(action)

print(f"Total Cleared: {info['total_cleared']}")
print(f"Fairness Index: {info['fairness_score']:.2f}")
```

---

## 🧠 Environment Design Philosophy

### State Space

The environment exposes a **14-dimensional** continuous observation vector, providing the agent with full situational awareness:

- **Queues (4)**: Exact vehicle count per lane [N, S, E, W].
- **Wait Pressure (4)**: Cumulative "impatience" score per lane.
- **Emergency Flags (4)**: Binary detection of EVs per lane.
- **Signal State (2)**: Current phase [0=NS, 1=EW] and step count.

### Action Space

- `0`: **Maintain** — keep the current green phase.
- `1`: **Switch** — transition the signal (includes yellow-phase discharge friction).

---

## 💎 Reward Engineering (The "Judge's Choice")

Our reward function is the core of this submission. It isn't just a count; it's a **multi-objective ethical framework** clipped to `[-1, 1]`:

| Component | Logic | Purpose |
| :--- | :--- | :--- |
| **Throughput (+)** | `+0.20 * cars_cleared` | Incentivizes active vehicle flow. |
| **Density (-)** | `-0.40 * total_congestion` | Penalizes letting the intersection fill up. |
| **Bottleneck (-)** | `-0.15 * max_queue` | Discourages extreme build-up in any single lane. |
| **Stability (-)** | `-switch_penalty` | Prevents "flickering" and promotes signal stability. |
| **Fairness (+/-)** | `+0.10` bonus / `-penalty` | Rewards balanced service; penalizes starvation. |
| **Emergency (🚨)** | Golden Window bonus | Massive reward for clearing EVs within target steps. |
| **EV Delay (-)** | Exponential penalty | Punishes agents for delaying life-saving vehicles. |

---

## 📊 Evaluation Metrics

We track **8 key performance indicators** per episode:

1. **Total Cleared**: Raw efficiency metric.
2. **Avg Waiting Time**: The "commuter frustration" index.
3. **Max Queue Length**: Gauges system robustness against bottlenecks.
4. **Signal Switch Count**: Measures policy stability.
5. **Congestion Score**: Final system state snapshot.
6. **Avg EV Clear Time**: Critical safety metric (lower is better).
7. **Fairness Score**: [0, 1] index — how equally did we serve all lanes?
8. **Total EV Penalty**: Measures total failure to prioritize safety.
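The fairness score is reported as a [0, 1] index of how equally the four lanes were served. The README does not spell out the exact formula, so here is a minimal sketch assuming Jain's fairness index over per-lane service counts (the function name `jain_fairness` and the zero-service convention are our assumptions, not the environment's actual implementation):

```python
# Hypothetical sketch of a [0, 1] lane-fairness index using Jain's index:
# (sum x)^2 / (n * sum x^2). It equals 1.0 for perfectly equal service and
# approaches 1/n when a single lane receives all the service.

def jain_fairness(served_per_lane: list[float]) -> float:
    """Return Jain's fairness index over per-lane service counts."""
    n = len(served_per_lane)
    total = sum(served_per_lane)
    if total == 0:
        return 1.0  # no vehicles served yet: treat as perfectly fair
    return total ** 2 / (n * sum(x * x for x in served_per_lane))

print(jain_fairness([10, 10, 10, 10]))  # equal service across lanes -> 1.0
print(jain_fairness([40, 0, 0, 0]))     # one lane takes everything -> 0.25
```

With four lanes the index is bounded below by 0.25, so a score near that floor signals that a single direction is monopolizing green time.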
---

## ⚡ Task Difficulty Levels

| Parameter | Easy | Medium | Hard |
| :--- | :--- | :--- | :--- |
| **Arrival Rate** | 0–1 | 1–3 | 2–5 |
| **Discharge Rate** | 4–5 | 3–5 | 2–4 |
| **Burst Frequency** | 0% | 10% | 20% |
| **Emergency Prob** | 1% | 5% | 15% |
| **EV Golden Window** | 8 steps | 5 steps | 3 steps |
| **Fairness Limit** | 20 steps | 15 steps | 10 steps |

---

## 🚑 Emergency & Fairness Logic

### The "Golden Window"

When an Emergency Vehicle (EV) appears, the agent is granted a bonus if it switches and clears the lane within the **Golden Window** (defined per difficulty). Failing to do so triggers an **exponential delay penalty**, simulating the real-world cost of stopping an ambulance or fire truck.

### Fairness Guard

To prevent **starvation** (where the agent ignores a low-traffic lane to optimize throughput on a high-traffic lane), a **Fairness Score** is calculated. If a lane remains red beyond the **Starvation Limit**, the agent suffers a heavy penalty. This forces the agent to learn the complex trade-off between total throughput and social fairness.

---

## 🚶 Step Walkthrough

```text
Step 12: 🚨 Ambulance detected in East lane (currently RED).
  - EW Queue: 4, EV Timer: 0
  - Agent receives p_emergency penalty.

Step 13: Agent Action: 1 (SWITCH to EW).
  - Switch penalty applied (-0.20).
  - NS lanes stop; EW lanes turn GREEN.

Step 14: EV Cleared!
  - EV Clear Time: 2 steps.
  - Agent receives r_ev_bonus (+0.25) for "Golden Window" clearance.
  - Total cleared (+0.60 reward).
```

---

## 🔮 Future Improvements

- **Multi-Intersection Coordination**: Extending to a grid of agents using MARL.
- **Pedestrian Logic**: Adding crosswalks and pedestrian priority.
- **V2X Communication**: Providing agents with ahead-of-time traffic predictions.

---

## 📜 License

MIT © 2026 Meta x PyTorch OpenEnv Hackathon
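As a closing sketch, the Golden Window bonus and exponential EV delay penalty described above can be illustrated as follows. The function name `ev_reward`, the penalty base, and the cap are illustrative assumptions; only the `+0.25` bonus is taken from the step walkthrough:

```python
# Illustrative sketch of the Golden Window / exponential EV delay logic.
# Constants other than the +0.25 bonus are hypothetical; the environment
# defines its own scales in env.py.

def ev_reward(ev_wait_steps: int, cleared: bool, golden_window: int) -> float:
    """Bonus for clearing an EV inside the Golden Window; an exponentially
    growing penalty for every step it stays blocked beyond it."""
    if cleared and ev_wait_steps <= golden_window:
        return 0.25  # Golden Window bonus, as in the walkthrough
    if ev_wait_steps > golden_window:
        # Penalty doubles per step of overshoot, capped at the -1.0 clip.
        overshoot = ev_wait_steps - golden_window
        return max(-1.0, -0.05 * (2 ** overshoot))
    return 0.0  # EV still waiting, but inside the window

print(ev_reward(2, cleared=True, golden_window=5))   # -> 0.25
print(ev_reward(7, cleared=False, golden_window=5))  # -> -0.2
```

The exponential shape means a one-step delay past the window is cheap but a five-step delay is near the clip limit, which is what pushes agents to switch immediately rather than finish serving the current queue.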