| title: Traffic Signal Optimization — OpenEnv Elite | |
| emoji: 🚦 | |
| colorFrom: blue | |
| colorTo: green | |
| sdk: docker | |
| app_port: 7860 | |
| pinned: false | |
| # 🚥 Traffic Signal Optimization — OpenEnv Elite | |
| > **Meta × PyTorch OpenEnv Hackathon Submission** | |
| > | |
| > A world-class Reinforcement Learning environment for urban traffic control, featuring stochastic multi-lane dynamics, emergency vehicle prioritization, and sophisticated fairness-driven rewards. | |
| --- | |
| ## 🏗️ Problem Statement | |
| Fixed-cycle traffic signals are a relic of the past. In modern urban environments, they create **needless congestion**, increase **CO2 emissions**, and — most critically — cause **life-threatening delays** for emergency vehicles. | |
| This project provides a high-fidelity 4-way intersection simulation designed for OpenEnv. It challenges RL agents to move beyond simple throughput and master the art of **dynamic balancing**: serving high-demand lanes while maintaining fairness for low-traffic directions and clearing "Golden Windows" for emergency responders. | |
| --- | |
| ## 🚀 Quick Start | |
| ```bash | |
| # Run the complete suite: Simulation + Sanity Checks + Comparison | |
| python test_env.py | |
| # Run a specific high-intensity scenario | |
| python test_env.py hard | |
| ``` | |
| ```python | |
| from env import TrafficEnv | |
| from tasks import get_config | |
| from baseline_agent import RuleBasedAgent | |
| # 1. Load a structured difficulty profile | |
| config = get_config("medium") | |
| env = TrafficEnv(config) | |
| # 2. Initialize our sophisticated Rule-Based Controller | |
| agent = RuleBasedAgent() | |
| state = env.reset() | |
| done = False | |
| while not done: | |
|     action = agent.select_action(state) | |
|     state, reward, done, info = env.step(action) | |
| print(f"Total Cleared: {info['total_cleared']}") | |
| print(f"Fairness Index: {info['fairness_score']:.2f}") | |
| ``` | |
| --- | |
| ## 🧠 Environment Design Philosophy | |
| ### State Space | |
| The environment exposes a **14-dimensional** continuous observation vector, providing the agent with full situational awareness: | |
| - **Queues (4)**: Exact vehicle count per lane [N, S, E, W]. | |
| - **Wait Pressure (4)**: Cumulative "impatience" score per lane. | |
| - **Emergency Flags (4)**: Binary detection of EVs per lane. | |
| - **Signal State (2)**: Current phase [0=NS, 1=EW] and step count. | |
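A minimal sketch of how the four groups above could be packed into one 14-element vector. The helper `build_observation` and its argument names are hypothetical, and the ordering simply follows the list above; the real `TrafficEnv` may order or scale the fields differently.

```python
from typing import List, Sequence

LANES = ("N", "S", "E", "W")  # lane order taken from the Queues bullet


def build_observation(
    queues: Sequence[float],
    wait_pressure: Sequence[float],
    emergency_flags: Sequence[int],
    phase: int,
    step_count: int,
) -> List[float]:
    """Concatenate the per-lane groups and the signal state into a flat list."""
    groups = (
        ("queues", queues),
        ("wait_pressure", wait_pressure),
        ("emergency_flags", emergency_flags),
    )
    for name, group in groups:
        if len(group) != len(LANES):
            raise ValueError(f"{name} must have one entry per lane")
    if phase not in (0, 1):
        raise ValueError("phase must be 0 (NS green) or 1 (EW green)")
    return [
        *map(float, queues),           # indices 0-3
        *map(float, wait_pressure),    # indices 4-7
        *map(float, emergency_flags),  # indices 8-11
        float(phase),                  # index 12
        float(step_count),             # index 13
    ]


# Example: NS is green, an emergency vehicle is waiting in the East lane.
obs = build_observation(
    queues=[3, 0, 4, 1],
    wait_pressure=[2.0, 0.0, 5.5, 1.0],
    emergency_flags=[0, 0, 1, 0],
    phase=0,
    step_count=6,
)
```

With this layout, an agent can read the East-lane emergency flag at index 10 and the current phase at index 12.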
| ### Action Space | |
| - `0`: **Maintain** — keep the current green phase. | |
| - `1`: **Switch** — transition the signal (includes yellow-phase discharge friction). | |
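For comparison, the fixed-cycle behavior criticized in the problem statement can be expressed in this two-action space. The sketch below assumes a hypothetical `FixedCyclePolicy` class (not part of the repository) that ignores the observation and emits `1` every `cycle_length` steps.

```python
MAINTAIN, SWITCH = 0, 1  # action codes from the list above


class FixedCyclePolicy:
    """Hypothetical baseline: switch every `cycle_length` steps, ignore traffic."""

    def __init__(self, cycle_length: int = 10) -> None:
        if cycle_length < 1:
            raise ValueError("cycle_length must be at least 1")
        self.cycle_length = cycle_length
        self._steps_since_switch = 0

    def select_action(self, state=None) -> int:
        # The observation is accepted only for interface compatibility.
        self._steps_since_switch += 1
        if self._steps_since_switch >= self.cycle_length:
            self._steps_since_switch = 0
            return SWITCH
        return MAINTAIN


policy = FixedCyclePolicy(cycle_length=3)
actions = [policy.select_action() for _ in range(7)]
print(actions)
```

A policy like this is a useful lower bound: any learned agent should at least beat it on the metrics tracked by the environment.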
| --- | |
| ## 💎 Reward Engineering (The "Judge's Choice") | |
| The reward function is the core of this submission. Rather than a raw vehicle count, it is a **multi-objective** signal that combines the components below into a single scalar clipped to `[-1, 1]`: | |
| | Component | Logic | Purpose | | |
| | :--- | :--- | :--- | | |
| | **Throughput (+)** | `+0.20 * cars_cleared` | Incentivizes active vehicle flow. | | |
| | **Density (-)** | `-0.40 * total_congestion` | Penalizes letting the intersection fill up. | | |
| | **Bottleneck (-)** | `-0.15 * max_queue` | Discourages extreme build-up in any single lane. | | |
| | **Stability (-)** | `-switch_penalty` | Prevents "flickering" and promotes signal stability. | | |
| | **Fairness (+/-)** | `+0.10` bonus / `-penalty` | Rewards balanced service; penalizes starvation. | | |
| | **Emergency (🚨)** | `Golden Window` Bonus | Massive reward for clearing EVs within target steps. | | |
| | **EV Delay (-)** | `Exponential Penalty` | Punishes agents for delaying life-saving vehicles. | | |
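The table can be read as a clipped sum of terms. The sketch below is one way to write that down, not the repository's implementation: the three numeric weights come from the table, the `0.20` switch penalty comes from the step walkthrough, and everything else is an assumption. In particular, `compute_reward` is a hypothetical name, `total_congestion` and `max_queue` are assumed to be normalized to `[0, 1]`, and the fairness and emergency terms are passed in precomputed because the README does not give their formulas.

```python
def compute_reward(
    cars_cleared: int,
    total_congestion: float,   # assumed normalized to [0, 1]
    max_queue: float,          # assumed normalized to [0, 1]
    switched: bool,
    fairness_term: float = 0.0,  # +0.10 bonus or a negative penalty
    ev_term: float = 0.0,        # Golden Window bonus or EV delay penalty
    switch_penalty: float = 0.20,
) -> float:
    """Sum the reward components and clip the result to [-1, 1]."""
    raw = (
        0.20 * cars_cleared
        - 0.40 * total_congestion
        - 0.15 * max_queue
        - (switch_penalty if switched else 0.0)
        + fairness_term
        + ev_term
    )
    return max(-1.0, min(1.0, raw))


# Three cars cleared plus a 0.25 EV bonus, with an otherwise empty intersection.
print(round(compute_reward(3, 0.0, 0.0, switched=False, ev_term=0.25), 2))
# A very productive step still saturates at the upper clip.
print(compute_reward(10, 0.0, 0.0, switched=False))
```

Because of the clip, large throughput cannot fully offset a large emergency penalty in the same step, which keeps any single component from dominating the signal.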
| --- | |
| ## 📊 Evaluation Metrics | |
| We track **8 key performance indicators** per episode so that agents can be compared quantitatively: | |
| 1. **Total Cleared**: Raw efficiency metric. | |
| 2. **Avg Waiting Time**: The "commuter frustration" index. | |
| 3. **Max Queue Length**: Gauges system robustness against bottlenecks. | |
| 4. **Signal Switch Count**: Measures policy stability. | |
| 5. **Congestion Score**: Final system state snapshot. | |
| 6. **Avg EV Clear Time**: Critical safety metric (lower is better). | |
| 7. **Fairness Score**: [0, 1] index — how equally did we serve all lanes? | |
| 8. **Total EV Penalty**: Measures total failure to prioritize safety. | |
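The README does not state how the Fairness Score is computed. One widely used measure that falls in the stated range is Jain's fairness index, sketched below on per-lane served counts. This is an illustrative stand-in, not necessarily the formula the environment uses, and `jain_fairness` is a hypothetical name.

```python
from typing import Sequence


def jain_fairness(served_per_lane: Sequence[float]) -> float:
    """Jain's index: (sum x)^2 / (n * sum x^2), ranging from 1/n to 1."""
    n = len(served_per_lane)
    if n == 0:
        raise ValueError("at least one lane is required")
    total = sum(served_per_lane)
    sum_of_squares = sum(x * x for x in served_per_lane)
    if sum_of_squares == 0:
        # No lane was served at all; treat that as trivially equal service.
        return 1.0
    return (total * total) / (n * sum_of_squares)


print(jain_fairness([5, 5, 5, 5]))   # perfectly balanced service
print(jain_fairness([20, 0, 0, 0]))  # one lane gets everything
```

With four lanes the index bottoms out at 0.25 rather than 0, so a score reported on a full `[0, 1]` scale would need rescaling or a different definition.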
| --- | |
| ## ⚡ Task Difficulty Levels | |
| | Parameter | Easy | Medium | Hard | | |
| | :--- | :--- | :--- | :--- | | |
| | **Arrival Rate** | 0–1 | 1–3 | 2–5 | | |
| | **Discharge Rate** | 4–5 | 3–5 | 2–4 | | |
| | **Burst Frequency** | 0% | 10% | 20% | | |
| | **Emergency Prob** | 1% | 5% | 15% | | |
| | **EV Golden Window** | 8 steps | 5 steps | 3 steps | | |
| | **Fairness Limit** | 20 steps | 15 steps | 10 steps | | |
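The table maps naturally onto a small immutable config object. The sketch below encodes the values exactly as listed; the `DifficultyProfile` class, its field names, and `get_profile` are hypothetical and are not the actual return type of `get_config` in `tasks`. Ranges are stored as `(low, high)` tuples without assuming a unit.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class DifficultyProfile:
    arrival_rate: Tuple[int, int]    # (low, high) from the table
    discharge_rate: Tuple[int, int]  # (low, high) from the table
    burst_frequency: float           # probability, e.g. 0.10 for 10%
    emergency_prob: float            # probability, e.g. 0.05 for 5%
    ev_golden_window: int            # steps
    fairness_limit: int              # steps


PROFILES: Dict[str, DifficultyProfile] = {
    "easy": DifficultyProfile((0, 1), (4, 5), 0.00, 0.01, 8, 20),
    "medium": DifficultyProfile((1, 3), (3, 5), 0.10, 0.05, 5, 15),
    "hard": DifficultyProfile((2, 5), (2, 4), 0.20, 0.15, 3, 10),
}


def get_profile(name: str) -> DifficultyProfile:
    """Look up a profile by name, with a clear error for unknown names."""
    try:
        return PROFILES[name]
    except KeyError:
        raise ValueError(f"unknown difficulty {name!r}; expected one of {sorted(PROFILES)}")


hard = get_profile("hard")
print(hard.ev_golden_window, hard.fairness_limit)
```

Note that on `hard` the upper arrival rate (5) exceeds the upper discharge rate (4), so queues can grow even on a green lane.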
| --- | |
| ## 🚑 Emergency & Fairness Logic | |
| ### The "Golden Window" | |
| When an Emergency Vehicle (EV) appears, the agent is granted a bonus if it clears the EV's lane within the **Golden Window** (defined per difficulty), switching the signal first if that lane is currently red. Failing to do so triggers an **exponential delay penalty**, simulating the real-world cost of stopping an ambulance or fire truck. | |
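A sketch of the bonus and penalty described above. The `0.25` bonus matches the step walkthrough; the exponential form and its `base`, `growth`, and `cap` parameters are assumptions chosen purely for illustration, since the README names the penalty as exponential without giving constants. Both function names are hypothetical.

```python
def ev_bonus(clear_time: int, golden_window: int, bonus: float = 0.25) -> float:
    """Bonus granted when the EV is cleared within the Golden Window."""
    return bonus if clear_time <= golden_window else 0.0


def ev_delay_penalty(
    steps_waiting: int,
    golden_window: int,
    base: float = 0.05,   # assumed penalty on the first overdue step, before growth
    growth: float = 1.5,  # assumed per-step multiplier
    cap: float = 1.0,     # keeps the term within the reward clip range
) -> float:
    """Negative term that grows exponentially once the window is exceeded."""
    overdue = steps_waiting - golden_window
    if overdue <= 0:
        return 0.0
    return -min(cap, base * growth ** overdue)


# Medium difficulty uses a 5-step window.
print(ev_bonus(clear_time=2, golden_window=5))
print([round(ev_delay_penalty(t, 5), 3) for t in range(5, 10)])
```

The cap matters: without it the penalty would quickly exceed the clip range and every late step would look identical to the agent.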
| ### Fairness Guard | |
| To prevent "Starvation" (where the agent ignores a low-traffic lane to optimize throughput on a high-traffic lane), a **Fairness Score** is calculated. If a lane remains red beyond the **Fairness Limit** (the per-difficulty starvation threshold), the agent suffers a heavy penalty. This forces the agent to learn the trade-off between total throughput and fairness across lanes. | |
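One way to detect starvation is to count consecutive red steps per lane and compare against the limit. The sketch below assumes a hypothetical `StarvationTracker` class and treats a lane as starved purely on red time; the actual environment may also require that vehicles are waiting in the lane.

```python
from typing import Dict, List

NS_LANES = ("N", "S")
EW_LANES = ("E", "W")


class StarvationTracker:
    """Counts consecutive red steps per lane and reports lanes over the limit."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.red_steps: Dict[str, int] = {lane: 0 for lane in NS_LANES + EW_LANES}

    def update(self, phase: int) -> List[str]:
        """Advance one step; phase 0 means NS is green, phase 1 means EW is green."""
        green = NS_LANES if phase == 0 else EW_LANES
        for lane in self.red_steps:
            if lane in green:
                self.red_steps[lane] = 0
            else:
                self.red_steps[lane] += 1
        return [lane for lane, steps in self.red_steps.items() if steps > self.limit]


tracker = StarvationTracker(limit=3)
history = [tracker.update(phase=0) for _ in range(4)]
print(history[-1])              # lanes starved after four NS-green steps
print(tracker.update(phase=1))  # switching resets the EW counters
```

The returned list can feed a per-step penalty, so the agent is penalized for every step a lane stays past the limit rather than only once.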
| --- | |
| ## 🚶 Step Walkthrough | |
| ```text | |
| Step 12: 🚨 Ambulance detected in East lane (currently RED). | |
| - EW Queue: 4, EV Timer: 0 | |
| - Agent receives p_emergency penalty. | |
| Step 13: Agent Action: 1 (SWITCH to EW). | |
| - Switch penalty applied (-0.20). | |
| - NS lanes stop; EW lanes turn GREEN. | |
| Step 14: EV Cleared! | |
| - EV Clear Time: 2 steps. | |
| - Agent receives r_ev_bonus (+0.25) for "Golden Window" clearance. | |
| - Total cleared (+0.60 reward). | |
| ``` | |
| --- | |
| ## 🔮 Future Improvements | |
| - **Multi-Intersection Coordination**: Extending to a grid of agents using MARL. | |
| - **Pedestrian Logic**: Adding crosswalks and pedestrian priority. | |
| - **V2X Communication**: Providing agents with ahead-of-time traffic predictions. | |
| --- | |
| ## 📜 License | |
| MIT © 2026 Meta x PyTorch OpenEnv Hackathon | |