arrow072 committed on
Commit
c86c4cd
·
verified ·
1 Parent(s): 52b89f5

Upload 17 files

Dockerfile ADDED
@@ -0,0 +1,9 @@
+ FROM python:3.10
+
+ WORKDIR /app
+
+ COPY . .
+
+ RUN pip install fastapi uvicorn numpy pydantic
+
+ CMD ["uvicorn","inference:app","--host","0.0.0.0","--port","7860"]
README.md CHANGED
@@ -1,10 +1,160 @@
  ---
- title: Open Env Traffic System
- emoji: 🦀
- colorFrom: green
- colorTo: yellow
+ title: Traffic Signal Optimization - OpenEnv Elite
+ emoji: 🚦
+ colorFrom: blue
+ colorTo: green
  sdk: docker
+ app_port: 7860
  pinned: false
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # 🚥 Traffic Signal Optimization OpenEnv Elite
+
+ > **Meta × PyTorch OpenEnv Hackathon Submission**
+ >
+ > A reinforcement learning environment for urban traffic control, featuring stochastic multi-lane dynamics, emergency vehicle prioritization, and fairness-driven rewards.
+
+ ---
+
+ ## 🏗️ Problem Statement
+
+ Fixed-cycle traffic signals are a relic of the past. In modern urban environments, they create **needless congestion**, increase **CO2 emissions**, and, most critically, cause **life-threatening delays** for emergency vehicles.
+
+ This project provides a 4-way intersection simulation designed for OpenEnv. It challenges RL agents to move beyond raw throughput and learn **dynamic balancing**: serving high-demand lanes while maintaining fairness for low-traffic directions and clearing "Golden Windows" for emergency responders.
+
+ ---
+
+ ## 🚀 Quick Start
+
+ ```bash
+ # Run the complete suite: Simulation + Sanity Checks + Comparison
+ python test_env.py
+
+ # Run a specific high-intensity scenario
+ python test_env.py hard
+ ```
+
+ ```python
+ from env import TrafficEnv
+ from tasks import get_config
+ from baseline_agent import RuleBasedAgent
+
+ # 1. Load a structured difficulty profile
+ config = get_config("medium")
+ env = TrafficEnv(config)
+
+ # 2. Initialize the rule-based baseline controller
+ agent = RuleBasedAgent()
+
+ state = env.reset()
+ done = False
+
+ while not done:
+     action = agent.select_action(state)
+     state, reward, done, info = env.step(action)
+
+ print(f"Total Cleared: {info['total_cleared']}")
+ print(f"Fairness Index: {info['fairness_score']:.2f}")
+ ```
+
+ ---
+
+ ## 🧠 Environment Design Philosophy
+
+ ### State Space
+ The environment exposes a **14-dimensional** continuous observation vector, providing the agent with full situational awareness:
+ - **Queues (4)**: Exact vehicle count per lane [N, S, E, W].
+ - **Wait Pressure (4)**: Cumulative "impatience" score per lane.
+ - **Emergency Flags (4)**: Binary detection of EVs per lane.
+ - **Signal State (2)**: Current phase [0=NS, 1=EW] and step count.
+
+ ### Action Space
+ - `0`: **Maintain**: keep the current green phase.
+ - `1`: **Switch**: transition the signal (includes yellow-phase discharge friction).
+
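As a rough illustration of the 14-value layout listed above, the sketch below flattens a state dictionary in the order queues, wait pressure, emergency flags, signal state. The key names follow `env.py` in this commit; the helper name `flatten_state` and the sample values are hypothetical, and a plain list stands in for the NumPy array so the sketch has no dependencies.

```python
from typing import Any, Dict, List

LANES = ["north", "south", "east", "west"]


def flatten_state(state: Dict[str, Any]) -> List[float]:
    """Flatten a state dict into [queues, waits, emergency flags, phase, step]."""
    queues = [float(state[f"{lane}_cars"]) for lane in LANES]
    waits = [float(state["waiting_times"][lane]) for lane in LANES]
    flags = [float(state["emergency_flags"][lane]) for lane in LANES]
    signal = [float(state["phase"]), float(state["step_count"])]
    return queues + waits + flags + signal


# Hypothetical sample state, made up for illustration only.
sample_state = {
    "north_cars": 3, "south_cars": 1, "east_cars": 7, "west_cars": 0,
    "waiting_times": {"north": 0.6, "south": 0.2, "east": 14.0, "west": 0.0},
    "emergency_flags": {"north": False, "south": False, "east": True, "west": False},
    "phase": 0,
    "step_count": 12,
}

observation = flatten_state(sample_state)
print(len(observation))
```

Indexing by lane name, instead of relying on dictionary ordering, keeps the vector layout stable even if the state dictionary is built in a different order.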
+ ---
+
+ ## 💎 Reward Engineering (The "Judge's Choice")
+
+ Our reward function is the core of this submission. It is not a single count but a **multi-objective shaped reward**, clipped to `[-1, 1]`:
+
+ | Component | Logic | Purpose |
+ | :--- | :--- | :--- |
+ | **Throughput (+)** | `+0.20 * cars_cleared` | Incentivizes active vehicle flow. |
+ | **Density (-)** | `-0.40 * total_congestion` | Penalizes letting the intersection fill up. |
+ | **Bottleneck (-)** | `-0.15 * max_queue` | Discourages extreme build-up in any single lane. |
+ | **Stability (-)** | `-switch_penalty` | Prevents "flickering" and promotes signal stability. |
+ | **Fairness (+/-)** | `+0.10` bonus / `-penalty` | Rewards balanced service; penalizes starvation. |
+ | **Emergency (🚨)** | `Golden Window` Bonus | Massive reward for clearing EVs within target steps. |
+ | **EV Delay (-)** | `Exponential Penalty` | Punishes agents for delaying life-saving vehicles. |
+
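A minimal sketch of how the first four rows of the table could combine, assuming the congestion and bottleneck terms are normalised by lane capacity as `env.py` in this commit does. The function name `shaped_reward` and its parameter names are hypothetical, and the fairness and emergency terms are deliberately left out.

```python
def shaped_reward(
    cleared: int,
    total_queue: int,
    longest_queue: int,
    did_switch: bool,
    lane_capacity: int = 20,   # assumed per-lane cap (default max_queue in env.py)
    n_lanes: int = 4,
    switch_penalty: float = 0.2,
) -> float:
    """Sum the throughput, density, bottleneck and stability terms, then clip."""
    r_throughput = 0.20 * cleared
    p_density = -0.40 * total_queue / (lane_capacity * n_lanes)
    p_bottleneck = -0.15 * longest_queue / lane_capacity
    p_switch = -switch_penalty if did_switch else 0.0
    total = r_throughput + p_density + p_bottleneck + p_switch
    # Clip to [-1, 1] so a single busy step cannot dominate the return.
    return max(-1.0, min(1.0, total))


# Three cars cleared, 20 cars queued overall, longest queue 10, after a switch.
print(round(shaped_reward(3, 20, 10, True), 3))
```

With these inputs the terms are roughly +0.60, -0.10, -0.075 and -0.20, which shows how quickly a switch eats into the throughput reward.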
+ ---
+
+ ## 📊 Evaluation Metrics
+
+ We track **8 key performance indicators** per episode so that submissions can be compared quantitatively:
+
+ 1. **Total Cleared**: Raw efficiency metric.
+ 2. **Avg Waiting Time**: The "commuter frustration" index.
+ 3. **Max Queue Length**: Gauges system robustness against bottlenecks.
+ 4. **Signal Switch Count**: Measures policy stability.
+ 5. **Congestion Score**: Final system state snapshot.
+ 6. **Avg EV Clear Time**: Critical safety metric (lower is better).
+ 7. **Fairness Score**: Index in [0, 1]: how equally were all lanes served?
+ 8. **Total EV Penalty**: Measures total failure to prioritize safety.
+
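The Fairness Score above can be sketched as one minus the spread of per-lane wait pressure, scaled by the starvation limit, which is how `env.py` in this commit computes it with `np.std`. The version below uses the standard library's population standard deviation instead; the function name and the choice to return 1.0 when nobody is waiting are assumptions made for this example.

```python
from statistics import pstdev
from typing import Sequence


def fairness_score(wait_pressure: Sequence[float], starvation_limit: float = 10.0) -> float:
    """Return a value in [0, 1]; 1.0 means all lanes carry equal wait pressure."""
    if max(wait_pressure) <= 0:
        # Assumption: an empty intersection counts as perfectly fair.
        return 1.0
    spread = pstdev(wait_pressure)  # population std, the same default as np.std
    return max(0.0, 1.0 - spread / starvation_limit)


print(fairness_score([0, 4, 0, 4]))    # mild imbalance between NS and EW
print(fairness_score([0, 0, 40, 40]))  # EW heavily starved
```

Because the score floors at zero, it stops distinguishing between bad and very bad imbalance once the spread exceeds the starvation limit.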
+ ---
+
+ ## ⚡ Task Difficulty Levels
+
+ | Parameter | Easy | Medium | Hard |
+ | :--- | :--- | :--- | :--- |
+ | **Arrival Rate** | 0–1 | 1–3 | 2–5 |
+ | **Discharge Rate** | 4–5 | 3–5 | 2–4 |
+ | **Burst Frequency** | 0% | 10% | 20% |
+ | **Emergency Prob** | 1% | 5% | 15% |
+ | **EV Golden Window** | 8 steps | 5 steps | 3 steps |
+ | **Fairness Limit** | 20 steps | 15 steps | 10 steps |
+
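For orientation, the table above could map onto configuration dictionaries like the ones below. The keys are the ones `TrafficEnv.__init__` reads in this commit; `tasks.py` is not shown here, so `DIFFICULTY_PROFILES` and `make_config` are hypothetical stand-ins and may differ from what `get_config` actually returns.

```python
from typing import Any, Dict

# Hypothetical profiles transcribed from the difficulty table.
DIFFICULTY_PROFILES: Dict[str, Dict[str, Any]] = {
    "easy": {
        "arrival_rate": (0, 1), "discharge_rate": (4, 5),
        "burst_prob": 0.00, "emergency_prob": 0.01,
        "ev_golden_window": 8, "starvation_threshold": 20,
    },
    "medium": {
        "arrival_rate": (1, 3), "discharge_rate": (3, 5),
        "burst_prob": 0.10, "emergency_prob": 0.05,
        "ev_golden_window": 5, "starvation_threshold": 15,
    },
    "hard": {
        "arrival_rate": (2, 5), "discharge_rate": (2, 4),
        "burst_prob": 0.20, "emergency_prob": 0.15,
        "ev_golden_window": 3, "starvation_threshold": 10,
    },
}


def make_config(level: str) -> Dict[str, Any]:
    """Return a copy of the profile so callers can tweak it safely."""
    if level not in DIFFICULTY_PROFILES:
        raise ValueError(f"Unknown difficulty {level!r}")
    return dict(DIFFICULTY_PROFILES[level])


print(make_config("hard")["arrival_rate"])
```

Returning a copy means an experiment can override a single knob, such as `emergency_prob`, without mutating the shared profile.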
+ ---
+
+ ## 🚑 Emergency & Fairness Logic
+
+ ### The "Golden Window"
+ When an Emergency Vehicle (EV) appears, the agent is granted a bonus if it switches and clears the lane within the **Golden Window** (defined per difficulty). Failing to do so triggers an **exponential delay penalty**, simulating the real-world cost of stopping an ambulance or fire truck.
+
+ ### Fairness Guard
+ To prevent "Starvation" (where the agent ignores a low-traffic lane to optimize throughput on a high-traffic lane), a **Fairness Score** is calculated. If a lane remains red beyond the **Starvation Limit** (the *Fairness Limit* row in the difficulty table), the agent suffers a heavy penalty. This forces the agent to learn the trade-off between total throughput and social fairness.
+
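One way to express the Golden Window idea described above is a fixed bonus inside the window and a penalty that doubles with each overdue step afterwards. This is only a sketch: the function name, the penalty base, the growth factor and the cap are all assumptions chosen for illustration, and the reward code in `env.py` may shape these terms differently.

```python
def ev_reward(
    clear_time: int,
    golden_window: int,
    bonus: float = 0.25,         # matches the default r_ev_bonus_scale in env.py
    base_penalty: float = 0.05,  # assumed starting penalty
    growth: float = 2.0,         # assumed growth factor per overdue step
    cap: float = 1.0,            # keeps the term inside the clipped reward range
) -> float:
    """Bonus if the EV clears within the window, else an exponential penalty."""
    if clear_time <= golden_window:
        return bonus
    overdue_steps = clear_time - golden_window
    return -min(cap, base_penalty * growth ** overdue_steps)


for steps in (2, 6, 8, 20):
    print(steps, ev_reward(steps, golden_window=5))
```

Capping the penalty keeps one badly delayed vehicle from saturating the clipped reward for the rest of the episode.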
+ ---
+
+ ## 🚶 Step Walkthrough
+
+ ```text
+ Step 12: 🚨 Ambulance detected in East lane (currently RED).
+   - EW Queue: 4, EV Timer: 0
+   - Agent receives p_emergency penalty.
+
+ Step 13: Agent Action: 1 (SWITCH to EW).
+   - Switch penalty applied (-0.20).
+   - NS lanes stop; EW lanes turn GREEN.
+
+ Step 14: EV Cleared!
+   - EV Clear Time: 2 steps.
+   - Agent receives r_ev_bonus (+0.25) for "Golden Window" clearance.
+   - Total cleared (+0.60 reward).
+ ```
+
+ ---
+
+ ## 🔮 Future Improvements
+
+ - **Multi-Intersection Coordination**: Extending to a grid of agents using MARL.
+ - **Pedestrian Logic**: Adding crosswalks and pedestrian priority.
+ - **V2X Communication**: Providing agents with ahead-of-time traffic predictions.
+
+ ---
+
+ ## 📜 License
+
+ MIT © 2026 Meta x PyTorch OpenEnv Hackathon
__pycache__/baseline_agent.cpython-313.pyc ADDED
Binary file (5.31 kB).
__pycache__/env.cpython-313.pyc ADDED
Binary file (19.7 kB).
__pycache__/inference.cpython-313.pyc ADDED
Binary file (2.71 kB).
__pycache__/tasks.cpython-313.pyc ADDED
Binary file (3.33 kB).
baseline_agent.py ADDED
@@ -0,0 +1,154 @@
+ """
+ baseline_agent.py - Rule-Based Traffic Signal Controller
+ =========================================================
+
+ A deterministic agent that makes signal decisions using handcrafted
+ heuristics. Acts as the reproducible baseline for comparison against
+ trained RL policies.
+
+ Decision hierarchy (highest priority first):
+   1. Emergency vehicle preemption: switch if an emergency vehicle is
+      stuck at a red light and minimum green time has been served.
+   2. Minimum green time: never switch before a floor number of steps
+      to prevent rapid oscillation.
+   3. Queue-imbalance trigger: switch when the queued-vehicle disparity
+      between NS and EW exceeds a configurable threshold.
+   4. Maximum green cap: force a switch if one direction has been green
+      for too long (fairness guard).
+   5. Default: keep current phase.
+
+ Usage
+ -----
+     from baseline_agent import RuleBasedAgent
+     agent = RuleBasedAgent(min_green_time=5, imbalance_threshold=5)
+     action = agent.select_action(state)  # 0 or 1
+ """
+
+ from __future__ import annotations
+ from typing import Any, Dict
+
+
+ class RuleBasedAgent:
+     """
+     Rule-based traffic signal controller.
+
+     Parameters
+     ----------
+     min_green_time : int
+         Minimum number of steps to hold a phase before switching.
+         Prevents oscillatory behaviour.
+     imbalance_threshold : int
+         Minimum queue difference (NS vs EW) required to trigger a switch.
+     max_green_time : int
+         Maximum consecutive steps before forcing a phase change.
+         Acts as a starvation safety net.
+     emergency_min_green : int
+         Reduced minimum green time used when an emergency vehicle is
+         waiting on a red lane.
+     """
+
+     def __init__(
+         self,
+         min_green_time: int = 5,
+         imbalance_threshold: int = 5,
+         max_green_time: int = 20,
+         emergency_min_green: int = 2,
+     ) -> None:
+         self.min_green_time = min_green_time
+         self.imbalance_threshold = imbalance_threshold
+         self.max_green_time = max_green_time
+         self.emergency_min_green = emergency_min_green
+
+         # Steps since last switch
+         self._steps_since_switch: int = 0
+
+     # ------------------------------------------------------------------
+     # Public API
+     # ------------------------------------------------------------------
+
+     def select_action(self, state: Dict[str, Any]) -> int:
+         """
+         Choose an action given the current environment state.
+
+         Parameters
+         ----------
+         state : dict
+             State dictionary as returned by ``TrafficEnv.get_state()``.
+
+         Returns
+         -------
+         int
+             0 → keep current signal phase
+             1 → switch signal phase
+         """
+         self._steps_since_switch += 1
+
+         north = state["north_cars"]
+         south = state["south_cars"]
+         east = state["east_cars"]
+         west = state["west_cars"]
+         phase = state["phase"]
+
+         # emergency_flags may be a dict (TrafficEnv) or a list (legacy)
+         ef = state["emergency_flags"]
+         if isinstance(ef, dict):
+             ev_north, ev_south = ef["north"], ef["south"]
+             ev_east, ev_west = ef["east"], ef["west"]
+         else:
+             ev_north, ev_south, ev_east, ev_west = (bool(x) for x in ef)
+
+         ns_total = north + south
+         ew_total = east + west
+
+         # ── Rule 1: Emergency preemption ──────────────────────────────
+         # High priority: switch if an EV is blocked on a red lane.
+         # A short buffer (emergency_min_green steps, default 2) avoids
+         # rapid jitter.
+         emergency_on_red = False
+         if phase == 0 and (ev_east or ev_west):
+             emergency_on_red = True
+         elif phase == 1 and (ev_north or ev_south):
+             emergency_on_red = True
+
+         if emergency_on_red:
+             if self._steps_since_switch >= self.emergency_min_green:
+                 return self._switch()
+
+         # ── Rule 2: Oscillation Damping (Minimum Green Time) ──────────
+         if self._steps_since_switch < self.min_green_time:
+             return 0
+
+         # ── Rule 3: Congestion/Pressure Trigger ───────────────────────
+         # We use a weighted pressure calculation (Queues + EV presence).
+         ns_pressure = ns_total + (20 if (ev_north or ev_south) else 0)
+         ew_pressure = ew_total + (20 if (ev_east or ev_west) else 0)
+
+         if phase == 0:  # NS currently green
+             # Only switch if EW pressure is significantly higher
+             if ew_pressure > ns_pressure + self.imbalance_threshold:
+                 return self._switch()
+         else:  # EW currently green
+             if ns_pressure > ew_pressure + self.imbalance_threshold:
+                 return self._switch()
+
+         # ── Rule 4: Fairness Guard (Maximum Green Time) ───────────────
+         if self._steps_since_switch >= self.max_green_time:
+             # Only switch if there's actually someone waiting on the other side
+             other_side_waiting = (ew_total > 0) if phase == 0 else (ns_total > 0)
+             if other_side_waiting:
+                 return self._switch()
+
+         # ── Rule 5: Default: hold current phase ───────────────────────
+         return 0
+
+     def reset(self) -> None:
+         """Reset internal step counter (call at the start of each episode)."""
+         self._steps_since_switch = 0
+
+     # ------------------------------------------------------------------
+     # Internal helpers
+     # ------------------------------------------------------------------
+
+     def _switch(self) -> int:
+         """Record a switch and reset the step counter."""
+         self._steps_since_switch = 0
+         return 1
env.py ADDED
@@ -0,0 +1,487 @@
+ """
+ env.py - TrafficEnv: 4-Way Intersection RL Environment
+ =======================================================
+ Meta × PyTorch OpenEnv Hackathon Submission
+
+ A production-quality reinforcement learning environment for optimising
+ traffic signals at a 4-way urban intersection.
+
+ Key design principles:
+   - Realistic stochastic vehicle dynamics (arrivals, discharge, congestion)
+   - Multi-component, shaped reward function
+   - Emergency vehicle priority logic
+   - Lane-starvation fairness penalty
+   - Three difficulty tiers: Easy / Medium / Hard
+   - Rich evaluation metrics exposed via info dict
+ """
+
+ from __future__ import annotations
+
+ import random
+ from typing import Any, Dict, List, Tuple
+
+ import numpy as np
+
+
+ # ---------------------------------------------------------------------------
+ # Constants
+ # ---------------------------------------------------------------------------
+
+ LANES: List[str] = ["north", "south", "east", "west"]
+ NS_LANES: List[str] = ["north", "south"]
+ EW_LANES: List[str] = ["east", "west"]
+
+ PHASE_NS = 0  # North-South green
+ PHASE_EW = 1  # East-West green
+
+
+ # ---------------------------------------------------------------------------
+ # Helper: observation vector for gym-compatible flat representation
+ # ---------------------------------------------------------------------------
+
+ def _state_to_vector(state: Dict[str, Any]) -> np.ndarray:
+     """Convert structured state dict → flat float32 numpy array."""
+     queues = [state["north_cars"], state["south_cars"],
+               state["east_cars"], state["west_cars"]]
+     waits = list(state["waiting_times"].values())
+     flags = [float(f) for f in state["emergency_flags"].values()]
+     extras = [float(state["phase"]), float(state["step_count"])]
+     return np.array(queues + waits + flags + extras, dtype=np.float32)
+
+
+ # ---------------------------------------------------------------------------
+ # TrafficEnv
+ # ---------------------------------------------------------------------------
+
+ class TrafficEnv:
+     """
+     Reinforcement-learning environment simulating a 4-way traffic intersection.
+
+     Parameters
+     ----------
+     config : dict
+         Configuration dictionary (see tasks.py for ready-made configs).
+
+     Environment interface
+     --------------------
+     reset() → state_dict
+     step(action: int) → (next_state, reward, done, info)
+     get_state() → state_dict
+     state_vector() → np.ndarray (flat observation for RL frameworks)
+
+     Actions
+     -------
+     0 : Keep current signal phase
+     1 : Switch signal phase (NS ↔ EW)
+
+     State dictionary keys
+     ---------------------
+     north_cars, south_cars, east_cars, west_cars : int queue sizes
+     waiting_times : dict cumulative wait per lane
+     phase : int 0=NS green, 1=EW green
+     emergency_flags : dict bool per lane
+     step_count : int
+     """
+
+     # ------------------------------------------------------------------
+     # Initialisation
+     # ------------------------------------------------------------------
+
+     def __init__(self, config: Dict[str, Any]) -> None:
+         # --- Core parameters ---
+         self.max_steps = int(config.get("max_steps", 100))
+         self.max_queue = int(config.get("max_queue", 20))
+         self.arrival_rate = tuple(config.get("arrival_rate", (0, 3)))
+         self.discharge_rate = tuple(config.get("discharge_rate", (3, 5)))
+         self.emergency_prob = float(config.get("emergency_prob", 0.05))
+         self.switch_penalty_val = float(config.get("switch_penalty", 0.2))
+         self.starvation_threshold = int(config.get("starvation_threshold", 10))
+
+         # --- Burst traffic (Medium / Hard) ---
+         self.burst_prob = float(config.get("burst_prob", 0.0))
+         self.burst_multiplier = float(config.get("burst_multiplier", 1.0))
+
+         # --- Reward scaling knobs (overridable) ---
+         self.r_efficiency_scale = float(config.get("r_efficiency_scale", 0.20))
+         self.p_congestion_scale = float(config.get("p_congestion_scale", 0.40))
+         self.p_max_q_scale = float(config.get("p_max_q_scale", 0.15))
+         self.p_starvation_scale = float(config.get("p_starvation_scale", 0.15))
+         self.r_fairness_bonus = float(config.get("r_fairness_bonus", 0.10))
+         self.r_improvement_bonus = float(config.get("r_improvement_bonus", 0.20))
+         self.p_emergency_scale = float(config.get("p_emergency_scale", 0.40))
+         self.r_ev_bonus_scale = float(config.get("r_ev_bonus_scale", 0.25))
+
+         # --- Difficulty-specific thresholds ---
+         self.ev_golden_window = int(config.get("ev_golden_window", 5))
+         self.ev_max_delay = int(config.get("ev_max_delay", 15))
+         self.starvation_limit = int(config.get("starvation_threshold", 10))
+
+         # --- Observation dimensionality ---
+         # 4 queues + 4 waits + 4 emergency flags + 2 extras = 14
+         self.obs_dim = 14
+
+         self.reset()
+
+     # ------------------------------------------------------------------
+     # Core API
+     # ------------------------------------------------------------------
+
+     def reset(self) -> Dict[str, Any]:
+         """Reset the environment for a new episode. Returns the initial state."""
+         self.queues: Dict[str, int] = {lane: 0 for lane in LANES}
+
+         # Cumulative waiting-time pressure per lane
+         self.waiting_times: Dict[str, float] = {lane: 0.0 for lane in LANES}
+
+         # Binary emergency-vehicle flags
+         self.emergency_flags: Dict[str, bool] = {lane: False for lane in LANES}
+
+         # Signal phase (0 = NS green, 1 = EW green)
+         self.phase: int = PHASE_NS
+
+         self.step_count: int = 0
+         self.total_cleared: int = 0
+         self.last_action: int = -1  # -1 means "no previous action"
+         self.consecutive_green: int = 0  # steps without a switch
+
+         # Track previous total queue for improvement bonus
+         self._prev_total_queue: int = 0
+
+         # Detailed metrics for hackathon evaluation
+         self._metrics: Dict[str, Any] = {
+             "total_cleared": 0,
+             "avg_waiting_time": 0.0,
+             "max_queue_length": 0,
+             "signal_switch_count": 0,
+             "congestion_score": 0.0,
+             "avg_ev_clear_time": 0.0,
+             "total_ev_cleared": 0,
+             "total_ev_penalty": 0.0,
+             "fairness_score": 1.0,
+         }
+
+         # Track waiting steps for emergency vehicles and phase stability
+         self.ev_timers: Dict[str, List[int]] = {lane: [] for lane in LANES}
+         self.phase_duration: int = 0
+         self._ev_clear_times: List[int] = []
+
+         return self.get_state()
+
+     # ------------------------------------------------------------------
+
+     def get_state(self) -> Dict[str, Any]:
+         """Return the current environment state as a structured dictionary."""
+         return {
+             "north_cars": self.queues["north"],
+             "south_cars": self.queues["south"],
+             "east_cars": self.queues["east"],
+             "west_cars": self.queues["west"],
+             "waiting_times": dict(self.waiting_times),  # copy
+             "phase": self.phase,
+             "emergency_flags": dict(self.emergency_flags),  # copy
+             "step_count": self.step_count,
+         }
+
+     # ------------------------------------------------------------------
+
+     def state_vector(self) -> np.ndarray:
+         """Return the current state as a flat float32 numpy array (gym-friendly)."""
+         return _state_to_vector(self.get_state())
+
+     # ------------------------------------------------------------------
+
+     def step(
+         self, action: int
+     ) -> Tuple[Dict[str, Any], float, bool, Dict[str, Any]]:
+         """
+         Advance the simulation by one step.
+
+         Parameters
+         ----------
+         action : int
+             0 → Keep current phase
+             1 → Switch phase
+
+         Returns
+         -------
+         next_state : dict
+         reward : float (clipped to [-1, +1])
+         done : bool
+         info : dict (evaluation metrics)
+         """
+         if action not in (0, 1):
+             raise ValueError(f"Invalid action {action}. Must be 0 or 1.")
+
+         self.step_count += 1
+
+         # ── 1. Record pre-step total queue for improvement bonus ──────
+         pre_total_queue = sum(self.queues.values())
+
+         # ── 2. Apply signal switch ────────────────────────────────────
+         did_switch = False
+         if action == 1:
+             self.phase = 1 - self.phase
+             self._metrics["signal_switch_count"] += 1
+             did_switch = True
+             self.phase_duration = 0
+         else:
+             self.phase_duration += 1
+         self.last_action = action
+
+         # ── 3. Discharge vehicles from green lanes ────────────────────
+         cleared_this_step = self._discharge_traffic()
+         self.total_cleared += cleared_this_step
+         self._metrics["total_cleared"] = self.total_cleared
+
+         # ── 4. Stochastic vehicle arrivals ────────────────────────────
+         self._add_arrivals()
+
+         # ── 5. Update waiting-time pressure ───────────────────────────
+         self._update_waiting_times()
+
+         # ── 6. Update scalar metrics ──────────────────────────────────
+         current_max_q = max(self.queues.values())
+         self._metrics["max_queue_length"] = max(
+             self._metrics["max_queue_length"], current_max_q
+         )
+         total_wait_sum = sum(self.waiting_times.values())
+         denom = max(1, self.total_cleared)
+         self._metrics["avg_waiting_time"] = total_wait_sum / denom
+         self._metrics["congestion_score"] = (
+             sum(self.queues.values()) / (self.max_queue * len(LANES))
+         )
+
+         # ── 7. Calculate reward ───────────────────────────────────────
+         post_total_queue = sum(self.queues.values())
+         reward = self._calculate_reward(
+             cleared=cleared_this_step,
+             did_switch=did_switch,
+             pre_total=pre_total_queue,
+             post_total=post_total_queue,
+             current_max_q=current_max_q,
+         )
+
+         # ── 8. Update fairness index ──────────────────────────────────
+         # Simple fairness: 1 - (std of wait times / starvation limit),
+         # floored at 0.
+         wait_vals = list(self.waiting_times.values())
+         if max(wait_vals) > 0:
+             self._metrics["fairness_score"] = max(
+                 0.0, 1.0 - (np.std(wait_vals) / self.starvation_limit)
+             )
+
+         # ── 9. Termination ────────────────────────────────────────────
+         done = self.step_count >= self.max_steps
+         self._prev_total_queue = post_total_queue
+
+         return self.get_state(), float(reward), done, dict(self._metrics)
+
+     # ------------------------------------------------------------------
+     # Internal dynamics
+     # ------------------------------------------------------------------
+
+     def _discharge_traffic(self) -> int:
+         """
+         Allow vehicles to pass through green lanes.
+
+         Discharge is stochastic: between discharge_rate[0] and
+         discharge_rate[1] vehicles leave per green lane per step.
+         """
+         cleared = 0
+         low, high = self.discharge_rate
+         green_lanes = NS_LANES if self.phase == PHASE_NS else EW_LANES
+
+         for lane in green_lanes:
+             num_to_clear = random.randint(low, high)
+             actual = min(self.queues[lane], num_to_clear)
+             self.queues[lane] -= actual
+             cleared += actual
+
+             # Reduce waiting-time pressure proportionally
+             if self.queues[lane] == 0:
+                 self.waiting_times[lane] = 0.0
+             else:
+                 # Each departing vehicle relieves ~2 units of wait pressure
+                 self.waiting_times[lane] = max(
+                     0.0, self.waiting_times[lane] - actual * 2.0
+                 )
+
+             # Clear emergency flag once queue nearly drained
+             if self.queues[lane] < 2:
+                 if self.emergency_flags[lane]:
+                     # Record clearance time for metrics
+                     if self.ev_timers[lane]:
+                         clear_time = self.ev_timers[lane].pop(0)
+                         self._ev_clear_times.append(clear_time)
+                         self._metrics["total_ev_cleared"] += 1
+                         self._metrics["avg_ev_clear_time"] = np.mean(self._ev_clear_times)
+                     self.emergency_flags[lane] = False
+
+         return cleared
+
+     # ------------------------------------------------------------------
+
+     def _add_arrivals(self) -> None:
+         """
+         Add stochastic vehicle arrivals to every lane.
+
+         In burst mode (Medium/Hard), random lanes occasionally
+         receive additional vehicles to simulate rush-hour spikes.
+         """
+         low, high = self.arrival_rate
+
+         for lane in LANES:
+             arrivals = random.randint(low, high)
+
+             # Burst traffic event
+             if random.random() < self.burst_prob:
+                 arrivals = int(arrivals * self.burst_multiplier)
+
+             # Emergency vehicle appearance
+             if random.random() < self.emergency_prob:
+                 self.emergency_flags[lane] = True
+                 self.ev_timers[lane].append(0)  # Start timing from age 0
+                 arrivals += random.randint(1, 2)  # EVs usually have follow-on traffic
+
+             self.queues[lane] = min(
+                 self.max_queue, self.queues[lane] + arrivals
+             )
+
+     # ------------------------------------------------------------------
+
+     def _update_waiting_times(self) -> None:
+         """
+         Increment lane-level waiting-time pressure.
+
+         Red lanes accumulate pressure faster (proportional to queue),
+         while green lanes still accumulate a smaller residual penalty.
+         """
+         green_lanes = NS_LANES if self.phase == PHASE_NS else EW_LANES
+
+         for lane in LANES:
+             q = self.queues[lane]
+             if q == 0:
+                 continue
+             if lane in green_lanes:
+                 self.waiting_times[lane] += 0.2 * q  # reduced residual pressure
+             else:
+                 self.waiting_times[lane] += 1.0 * q  # full waiting pressure
+
+             # Increment EV timers
+             if self.emergency_flags[lane]:
+                 for i in range(len(self.ev_timers[lane])):
+                     self.ev_timers[lane][i] += 1
+
+     # ------------------------------------------------------------------
+     # Reward function
+     # ------------------------------------------------------------------
+
+     def _calculate_reward(
+         self,
+         cleared: int,
+         did_switch: bool,
+         pre_total: int,
+         post_total: int,
+         current_max_q: int,
+     ) -> float:
+         """
+         Multi-component shaped reward function.
+
+         Reward philosophy:
+         - CLEAR & CONTINUOUS: Each component scales smoothly with the
+           underlying quantity to provide a usable gradient for the RL agent.
+         - COMPETING PRESSURES: Efficiency (+) vs. Stability (-) vs. Fairness (-).
+         - SAFETY-CRITICAL: Emergency response is heavily weighted.
+         """
+
+         # ── (1) Efficiency: Reward for high throughput ────────────────
+         r_efficiency = self.r_efficiency_scale * cleared
+
+         # ── (2) Congestion: Penalty for total density ─────────────────
+         congestion_ratio = post_total / (self.max_queue * len(LANES))
+         p_congestion = -self.p_congestion_scale * congestion_ratio
+
+         # ── (3) Max Queue Penalty: Discourage extreme bottlenecks ─────
+         # Critical for realistic urban flow to avoid total gridlock in one lane.
+         p_max_queue = -self.p_max_q_scale * (current_max_q / self.max_queue)
+
+         # ── (4) Switch Penalty: Stability constraint ──────────────────
+         p_switch = -self.switch_penalty_val if did_switch else 0.0
+
+         # ── (5) Improvement Bonus: Reward active decongestion ─────────
+         r_improvement = 0.0
+         if post_total < pre_total:
+             delta_ratio = (pre_total - post_total) / max(1, pre_total)
+             r_improvement = self.r_improvement_bonus * delta_ratio
+
+         # ── (6) Starvation & Fairness: Temporal constraints ───────────
+         # Wait-time penalty + bonus for staying in fair bounds.
+         p_starvation = 0.0
+         r_fairness = 0.0
+         starvation_limit_scaled = self.starvation_limit * 5.0
+         max_wait = max(self.waiting_times.values()) if self.waiting_times else 0
+
+         if max_wait > starvation_limit_scaled:
+             p_starvation = -self.p_starvation_scale * (max_wait / starvation_limit_scaled)
+         elif max_wait < (starvation_limit_scaled * 0.5):
+             r_fairness = self.r_fairness_bonus  # Bonus for keeping system balanced
+
+         # ── (7) Emergency Vehicle Priority ────────────────────────────
+         # Per-step bonus while an EV is served on a green lane; a
+         # queue-scaled penalty while it is blocked on red, plus an extra
+         # penalty once its wait exceeds ev_max_delay.
+         p_emergency = 0.0
+         r_ev_bonus = 0.0
+         red_lanes = EW_LANES if self.phase == PHASE_NS else NS_LANES
+
+         for lane in LANES:
+             if self.emergency_flags[lane]:
+                 # EVs cleared during discharge already had their flag reset,
+                 # so only EVs that are still waiting are scored here.
+                 if lane in red_lanes:
+                     # Ongoing penalty while blocked
+                     block_ratio = self.queues[lane] / max(1, self.max_queue)
+                     p_emergency -= self.p_emergency_scale * block_ratio
+
+                     # Increasing penalty based on how long it's been waiting
+                     for t in self.ev_timers[lane]:
+                         if t > self.ev_max_delay:
+                             p_emergency -= self.p_emergency_scale * 0.5
+                 else:
+                     # Bonus if currently being served in green lane
+                     r_ev_bonus += self.r_ev_bonus_scale * 0.2
+
+         # Record EV penalty for metrics
+         self._metrics["total_ev_penalty"] += abs(p_emergency)
+
+         # ── Aggregate & clip ──────────────────────────────────────────
+         total = (
+             r_efficiency
+             + p_congestion
+             + p_max_queue
+             + p_switch
+             + r_improvement
+             + p_starvation
+             + r_fairness
+             + p_emergency
+             + r_ev_bonus
+         )
+         return float(np.clip(total, -1.0, 1.0))
+
466
+ # ------------------------------------------------------------------
467
+ # Rendering
468
+ # ------------------------------------------------------------------
469
+
470
+ def render(self) -> str:
471
+ """Return a human-readable text snapshot of the intersection."""
472
+ phase_str = "NS 🟢 | EW 🔴" if self.phase == PHASE_NS else "NS 🔴 | EW 🟢"
473
+ ev_lanes = [lane for lane, f in self.emergency_flags.items() if f]
474
+ ev_str = ", ".join(ev_lanes) or "none"
475
+
476
+ # Calculate some quick stats for the render
477
+ total_q = sum(self.queues.values())
478
+ fairness = self._metrics.get("fairness_score", 1.0)
479
+
480
+ lines = [
481
+ f"Step {self.step_count:>4} / {self.max_steps} Phase: {phase_str} ({self.phase_duration} steps)",
482
+ f" North: {self.queues['north']:>3} cars | South: {self.queues['south']:>3} cars",
483
+ f" East: {self.queues['east']:>3} cars | West: {self.queues['west']:>3} cars",
484
+ f" Emergency: {ev_str:<15} | Fairness: {fairness:.2f}",
485
+ f" Total Q: {total_q:>3} | Cleared: {self.total_cleared:>4} | EV Clear Avg: {self._metrics['avg_ev_clear_time']:.1f}",
486
+ ]
487
+ return "\n".join(lines)
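The starvation and fairness terms in the reward computation above can be looked at in isolation. The following is a minimal sketch under stated assumptions, not the repository's code: the function name `starvation_fairness_terms` and its default values are hypothetical (the defaults loosely mirror the medium config), while the `5.0` multiplier and the `0.5` fairness band are copied from the diff.

```python
def starvation_fairness_terms(
    waiting_times: dict[str, float],
    starvation_limit: float = 15.0,    # hypothetical default
    p_starvation_scale: float = 0.15,  # hypothetical default
    r_fairness_bonus: float = 0.10,    # hypothetical default
) -> tuple[float, float]:
    """Return (p_starvation, r_fairness) for a single step."""
    # The diff scales the configured limit by 5.0 before comparing.
    limit = starvation_limit * 5.0
    max_wait = max(waiting_times.values()) if waiting_times else 0.0

    if max_wait > limit:
        # Penalty grows linearly with how far the worst lane exceeds the limit.
        return -p_starvation_scale * (max_wait / limit), 0.0
    if max_wait < limit * 0.5:
        # Flat bonus while every lane stays under half the scaled limit.
        return 0.0, r_fairness_bonus
    # Between half the limit and the limit: neither bonus nor penalty.
    return 0.0, 0.0
```

Note the dead band between half the scaled limit and the limit itself: an agent receives no signal there, so the bonus and the penalty never apply in the same step.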
index.html ADDED
@@ -0,0 +1,564 @@
1
+ <!DOCTYPE html>
2
+ <html lang="en">
3
+ <head>
4
+ <meta charset="UTF-8">
5
+ <meta name="viewport" content="width=device-width, initial-scale=1.0">
6
+ <title>OpenEnv Traffic Signal Optimization</title>
7
+ <style>
8
+ @import url('https://fonts.googleapis.com/css2?family=Outfit:wght@300;400;600;800&family=JetBrains+Mono:wght@400;700&display=swap');
9
+
10
+ :root {
11
+ --bg-color: #0d1117;
12
+ --panel-bg: rgba(22, 27, 34, 0.6);
13
+ --panel-border: rgba(48, 54, 61, 0.8);
14
+ --text-main: #c9d1d9;
15
+ --text-muted: #8b949e;
16
+ --accent-glow: rgba(56, 139, 253, 0.4);
17
+ --accent-color: #58a6ff;
18
+ --green-light: #3fb950;
19
+ --red-light: #f85149;
20
+ --ev-color: #ff7b72;
21
+ }
22
+
23
+ body {
24
+ margin: 0;
25
+ padding: 20px;
26
+ background-color: var(--bg-color);
27
+ color: var(--text-main);
28
+ font-family: 'Outfit', sans-serif;
29
+ display: grid;
30
+ grid-template-columns: 300px 1fr 300px;
31
+ gap: 20px;
32
+ height: 100vh;
33
+ overflow: hidden;
34
+ background: radial-gradient(circle at 50% -20%, #1a2332, #0d1117 70%);
35
+ }
36
+
37
+ .panel {
38
+ background: var(--panel-bg);
39
+ border: 1px solid var(--panel-border);
40
+ border-radius: 16px;
41
+ padding: 20px;
42
+ backdrop-filter: blur(12px);
43
+ box-shadow: 0 8px 32px rgba(0, 0, 0, 0.4);
44
+ display: flex;
45
+ flex-direction: column;
46
+ }
47
+
48
+ .header {
49
+ grid-column: 1 / -1;
50
+ display: flex;
51
+ justify-content: space-between;
52
+ align-items: center;
53
+ padding: 10px 20px;
54
+ background: var(--panel-bg);
55
+ border: 1px solid var(--panel-border);
56
+ border-radius: 16px;
57
+ margin-bottom: -10px;
58
+ z-index: 10;
59
+ }
60
+
61
+ .header h1 {
62
+ font-size: 1.4rem;
63
+ margin: 0;
64
+ font-weight: 800;
65
+ background: linear-gradient(90deg, #58a6ff, #a371f7);
66
+ -webkit-background-clip: text;
67
+ -webkit-text-fill-color: transparent;
68
+ }
69
+
70
+ .badge {
71
+ background: rgba(88, 166, 255, 0.1);
72
+ color: var(--accent-color);
73
+ padding: 4px 12px;
74
+ border-radius: 20px;
75
+ font-size: 0.8rem;
76
+ font-weight: 600;
77
+ border: 1px solid rgba(88, 166, 255, 0.2);
78
+ }
79
+
80
+ /* Metrics */
81
+ .metric-group {
82
+ margin-bottom: 20px;
83
+ }
84
+
85
+ .metric-label {
86
+ font-size: 0.8rem;
87
+ color: var(--text-muted);
88
+ text-transform: uppercase;
89
+ letter-spacing: 1px;
90
+ margin-bottom: 5px;
91
+ }
92
+
93
+ .metric-value {
94
+ font-family: 'JetBrains Mono', monospace;
95
+ font-size: 1.8rem;
96
+ font-weight: 700;
97
+ color: white;
98
+ text-shadow: 0 0 10px rgba(255, 255, 255, 0.2);
99
+ }
100
+
101
+ .metric-value.good { color: var(--green-light); text-shadow: 0 0 10px rgba(63, 185, 80, 0.4); }
102
+ .metric-value.warn { color: #d29922; }
103
+ .metric-value.bad { color: var(--red-light); }
104
+
105
+ /* Controls */
106
+ .controls {
107
+ margin-top: auto;
108
+ display: flex;
109
+ flex-direction: column;
110
+ gap: 10px;
111
+ }
112
+
113
+ button {
114
+ background: rgba(255, 255, 255, 0.05);
115
+ border: 1px solid var(--panel-border);
116
+ color: white;
117
+ padding: 12px;
118
+ border-radius: 8px;
119
+ font-family: 'Outfit', sans-serif;
120
+ font-size: 1rem;
121
+ font-weight: 600;
122
+ cursor: pointer;
123
+ transition: all 0.2s ease;
124
+ position: relative;
125
+ overflow: hidden;
126
+ }
127
+
128
+ button:hover {
129
+ background: rgba(255, 255, 255, 0.1);
130
+ transform: translateY(-2px);
131
+ }
132
+
133
+ button:active {
134
+ transform: translateY(1px);
135
+ }
136
+
137
+ button.primary {
138
+ background: var(--accent-color);
139
+ color: #0d1117;
140
+ border: none;
141
+ box-shadow: 0 0 15px var(--accent-glow);
142
+ }
143
+
144
+ button.primary:hover {
145
+ background: #79c0ff;
146
+ box-shadow: 0 0 20px var(--accent-glow);
147
+ }
148
+
149
+ button.danger {
150
+ background: rgba(248, 81, 73, 0.1);
151
+ color: var(--red-light);
152
+ border-color: rgba(248, 81, 73, 0.3);
153
+ }
154
+
155
+ button.danger:hover {
156
+ background: rgba(248, 81, 73, 0.2);
157
+ }
158
+
159
+ /* Visualizer */
160
+ .visualizer {
161
+ position: relative;
162
+ background: #11161d;
163
+ border-radius: 16px;
164
+ border: 1px solid var(--panel-border);
165
+ overflow: hidden;
166
+ display: flex;
167
+ justify-content: center;
168
+ align-items: center;
169
+ box-shadow: inset 0 0 50px rgba(0,0,0,0.5);
170
+ }
171
+
172
+ .road {
173
+ position: absolute;
174
+ background: #1e242c;
175
+ }
176
+
177
+ .road-v {
178
+ width: 120px;
179
+ height: 100%;
180
+ border-left: 2px dashed #4b5363;
181
+ border-right: 2px dashed #4b5363;
182
+ }
183
+
184
+ .road-h {
185
+ width: 100%;
186
+ height: 120px;
187
+ border-top: 2px dashed #4b5363;
188
+ border-bottom: 2px dashed #4b5363;
189
+ }
190
+
191
+ .intersection {
192
+ width: 120px;
193
+ height: 120px;
194
+ background: #232933;
195
+ position: absolute;
196
+ z-index: 2;
197
+ }
198
+
199
+ /* Traffic Lights */
200
+ .light {
201
+ width: 12px;
202
+ height: 12px;
203
+ border-radius: 50%;
204
+ position: absolute;
205
+ z-index: 5;
206
+ background: #30363d;
207
+ box-shadow: 0 0 0 2px #0d1117;
208
+ transition: all 0.3s ease;
209
+ }
210
+
211
+ .light.green {
212
+ background: var(--green-light);
213
+ box-shadow: 0 0 15px var(--green-light), 0 0 0 2px #0d1117;
214
+ }
215
+
216
+ .light.red {
217
+ background: var(--red-light);
218
+ box-shadow: 0 0 15px var(--red-light), 0 0 0 2px #0d1117;
219
+ }
220
+
221
+ .light-n { top: -20px; left: 20px; }
222
+ .light-s { bottom: -20px; right: 20px; }
223
+ .light-e { right: -20px; top: 20px; }
224
+ .light-w { left: -20px; bottom: 20px; }
225
+
226
+ /* Queues */
227
+ .queue-container {
228
+ position: absolute;
229
+ display: flex;
230
+ gap: 4px;
231
+ z-index: 3;
232
+ }
233
+
234
+ .queue-n { top: 10px; right: 50%; margin-right: 5px; flex-direction: column-reverse; height: calc(50% - 70px); align-items: center; }
235
+ .queue-s { bottom: 10px; left: 50%; margin-left: 5px; flex-direction: column; height: calc(50% - 70px); align-items: center; }
236
+ .queue-e { right: 10px; bottom: 50%; margin-bottom: 5px; flex-direction: row-reverse; width: calc(50% - 70px); align-items: center; justify-content: flex-start; }
237
+ .queue-w { left: 10px; top: 50%; margin-top: 5px; flex-direction: row; width: calc(50% - 70px); align-items: center; justify-content: flex-start; }
238
+
239
+ .car {
240
+ width: 14px;
241
+ height: 14px;
242
+ background: #8b949e;
243
+ border-radius: 3px;
244
+ transition: all 0.2s;
245
+ }
246
+
247
+ .queue-n .car, .queue-s .car { width: 14px; height: 18px; }
248
+ .queue-e .car, .queue-w .car { width: 18px; height: 14px; }
249
+
250
+ .car.emergency {
251
+ background: var(--ev-color);
252
+ box-shadow: 0 0 10px var(--ev-color);
253
+ animation: pulse 1s infinite alternate;
254
+ }
255
+
256
+ @keyframes pulse {
257
+ 0% { box-shadow: 0 0 5px var(--ev-color); }
258
+ 100% { box-shadow: 0 0 20px var(--ev-color); background: #ff9999; }
259
+ }
260
+
261
+ /* Toasts */
262
+ #toast-container {
263
+ position: fixed;
264
+ bottom: 20px;
265
+ right: 20px;
266
+ display: flex;
267
+ flex-direction: column;
268
+ gap: 10px;
269
+ z-index: 100;
270
+ }
271
+
272
+ .toast {
273
+ background: var(--panel-bg);
274
+ border: 1px solid var(--panel-border);
275
+ padding: 12px 20px;
276
+ border-radius: 8px;
277
+ backdrop-filter: blur(10px);
278
+ opacity: 0;
279
+ transform: translateY(20px);
280
+ animation: slideIn 0.3s forwards;
281
+ font-size: 0.9rem;
282
+ }
283
+
284
+ @keyframes slideIn {
285
+ to { opacity: 1; transform: translateY(0); }
286
+ }
287
+
288
+ .toggle-container {
289
+ display: flex;
290
+ align-items: center;
291
+ justify-content: space-between;
292
+ margin-bottom: 20px;
293
+ background: rgba(0,0,0,0.2);
294
+ padding: 12px;
295
+ border-radius: 8px;
296
+ }
297
+
298
+ /* Queue Numbers */
299
+ .q-num {
300
+ position: absolute;
301
+ font-family: 'JetBrains Mono', monospace;
302
+ font-size: 14px;
303
+ font-weight: bold;
304
+ color: white;
305
+ background: rgba(0,0,0,0.6);
306
+ padding: 2px 6px;
307
+ border-radius: 4px;
308
+ z-index: 10;
309
+ }
310
+ .qn-n { top: 20px; right: 20px; }
311
+ .qn-s { bottom: 20px; left: 20px; }
312
+ .qn-e { bottom: 20px; right: 20px; }
313
+ .qn-w { top: 20px; left: 20px; }
314
+
315
+ </style>
316
+ </head>
317
+ <body>
318
+
319
+ <div class="header">
320
+ <h1>Traffic Signal Optimization</h1>
321
+ <div class="badge">OpenEnv Elite Submission</div>
322
+ </div>
323
+
324
+ <!-- Left Panel: State -->
325
+ <div class="panel">
326
+ <h2 style="font-size: 1.1rem; margin-top: 0; border-bottom: 1px solid var(--panel-border); padding-bottom: 10px;">Simulation State</h2>
327
+
328
+ <div class="metric-group" style="margin-top: 15px;">
329
+ <div class="metric-label">Step Count</div>
330
+ <div class="metric-value" id="val-step">0</div>
331
+ </div>
332
+
333
+ <div class="metric-group">
334
+ <div class="metric-label">Signal Phase</div>
335
+ <div class="metric-value" id="val-phase" style="color: #58a6ff;">NS GREEN</div>
336
+ </div>
337
+
338
+ <div style="flex: 1;"></div>
339
+
340
+ <h3 style="font-size: 0.9rem; color: var(--text-muted); margin-bottom: 10px;">Waiting Time Pressure</h3>
341
+ <div style="display: grid; grid-template-columns: 1fr 1fr; gap: 10px;">
342
+ <div>
343
+ <div style="font-size: 0.7rem; color: var(--text-muted);">NORTH</div>
344
+ <div id="wait-n" style="font-family: monospace; font-size: 1.2rem;">0.0</div>
345
+ </div>
346
+ <div>
347
+ <div style="font-size: 0.7rem; color: var(--text-muted);">SOUTH</div>
348
+ <div id="wait-s" style="font-family: monospace; font-size: 1.2rem;">0.0</div>
349
+ </div>
350
+ <div>
351
+ <div style="font-size: 0.7rem; color: var(--text-muted);">EAST</div>
352
+ <div id="wait-e" style="font-family: monospace; font-size: 1.2rem;">0.0</div>
353
+ </div>
354
+ <div>
355
+ <div style="font-size: 0.7rem; color: var(--text-muted);">WEST</div>
356
+ <div id="wait-w" style="font-family: monospace; font-size: 1.2rem;">0.0</div>
357
+ </div>
358
+ </div>
359
+ </div>
360
+
361
+ <!-- Center: Visualizer -->
362
+ <div class="visualizer">
363
+ <div class="road road-v"></div>
364
+ <div class="road road-h"></div>
365
+ <div class="intersection">
366
+ <div class="light light-n" id="light-n"></div>
367
+ <div class="light light-s" id="light-s"></div>
368
+ <div class="light light-e" id="light-e"></div>
369
+ <div class="light light-w" id="light-w"></div>
370
+ </div>
371
+
372
+ <div class="q-num qn-n" id="qn-n">N: 0</div>
373
+ <div class="q-num qn-s" id="qn-s">S: 0</div>
374
+ <div class="q-num qn-e" id="qn-e">E: 0</div>
375
+ <div class="q-num qn-w" id="qn-w">W: 0</div>
376
+
377
+ <div class="queue-container queue-n" id="q-n"></div>
378
+ <div class="queue-container queue-s" id="q-s"></div>
379
+ <div class="queue-container queue-e" id="q-e"></div>
380
+ <div class="queue-container queue-w" id="q-w"></div>
381
+ </div>
382
+
383
+ <!-- Right Panel: Metrics & Controls -->
384
+ <div class="panel">
385
+ <h2 style="font-size: 1.1rem; margin-top: 0; border-bottom: 1px solid var(--panel-border); padding-bottom: 10px;">Metrics</h2>
386
+
387
+ <div class="metric-group" style="margin-top: 15px;">
388
+ <div class="metric-label">Total Cleared</div>
389
+ <div class="metric-value good" id="val-cleared">0</div>
390
+ </div>
391
+
392
+ <div class="metric-group">
393
+ <div class="metric-label">Fairness Score</div>
394
+ <div class="metric-value" id="val-fairness">1.00</div>
395
+ </div>
396
+
397
+ <div class="metric-group">
398
+ <div class="metric-label">Congestion Base</div>
399
+ <div class="metric-value warn" id="val-congestion">0.00</div>
400
+ </div>
401
+
402
+ <div class="controls">
403
+ <div class="toggle-container">
404
+ <span style="font-weight: 600;">Agent Auto-Mode</span>
405
+ <label style="position: relative; display: inline-block; width: 40px; height: 20px;">
406
+ <input type="checkbox" id="auto-play" style="opacity: 0; width: 0; height: 0;">
407
+ <span style="position: absolute; cursor: pointer; top: 0; left: 0; right: 0; bottom: 0; background-color: rgba(255,255,255,0.1); transition: .4s; border-radius: 20px; border: 1px solid var(--panel-border);" id="toggle-slider"></span>
408
+ </label>
409
+ </div>
410
+
411
+ <button onclick="doStep(0)">Keep Phase (0)</button>
412
+ <button class="primary" onclick="doStep(1)">Switch Phase (1)</button>
413
+ <button class="danger" onclick="doReset()" style="margin-top: 10px;">Reset Env</button>
414
+ </div>
415
+ </div>
416
+
417
+ <div id="toast-container"></div>
418
+
419
+ <script>
420
+ let autoPlayInterval = null;
421
+
422
+ document.getElementById('auto-play').addEventListener('change', function(e) {
423
+ const slider = document.getElementById('toggle-slider');
424
+ if (e.target.checked) {
425
+ slider.style.backgroundColor = 'var(--accent-color)';
426
+ autoPlayInterval = setInterval(() => {
427
+ doAutoStep();
428
+ }, 300);
429
+ showToast('Agent Auto-Mode Enabled');
430
+ } else {
431
+ slider.style.backgroundColor = 'rgba(255,255,255,0.1)';
432
+ if (autoPlayInterval) {
433
+ clearInterval(autoPlayInterval);
434
+ autoPlayInterval = null;
435
+ }
436
+ showToast('Manual Control Restored');
437
+ }
438
+ });
439
+
440
+ function showToast(msg) {
441
+ const container = document.getElementById('toast-container');
442
+ const toast = document.createElement('div');
443
+ toast.className = 'toast';
444
+ toast.innerText = msg;
445
+ container.appendChild(toast);
446
+ setTimeout(() => {
447
+ toast.style.opacity = '0';
448
+ setTimeout(() => toast.remove(), 300);
449
+ }, 2000);
450
+ }
451
+
452
+ function updateUI(data) {
453
+ const state = data.state;
454
+ const info = data.info || {};
455
+
456
+ // Update State Top
457
+ document.getElementById('val-step').innerText = state.step_count;
458
+
459
+ const pText = state.phase === 0 ? "NS GREEN" : "EW GREEN";
460
+ const pColor = state.phase === 0 ? "var(--green-light)" : "var(--accent-color)";
461
+ const pEl = document.getElementById('val-phase');
462
+ pEl.innerText = pText;
463
+ pEl.style.color = pColor;
464
+
465
+ // Lights
466
+ if (state.phase === 0) {
467
+ document.getElementById('light-n').className = 'light light-n green';
468
+ document.getElementById('light-s').className = 'light light-s green';
469
+ document.getElementById('light-e').className = 'light light-e red';
470
+ document.getElementById('light-w').className = 'light light-w red';
471
+ } else {
472
+ document.getElementById('light-n').className = 'light light-n red';
473
+ document.getElementById('light-s').className = 'light light-s red';
474
+ document.getElementById('light-e').className = 'light light-e green';
475
+ document.getElementById('light-w').className = 'light light-w green';
476
+ }
477
+
478
+ // Waiting
479
+ document.getElementById('wait-n').innerText = (state.waiting_times.north || 0).toFixed(1);
480
+ document.getElementById('wait-s').innerText = (state.waiting_times.south || 0).toFixed(1);
481
+ document.getElementById('wait-e').innerText = (state.waiting_times.east || 0).toFixed(1);
482
+ document.getElementById('wait-w').innerText = (state.waiting_times.west || 0).toFixed(1);
483
+
484
+ // Queues numbers
485
+ document.getElementById('qn-n').innerText = `N: ${state.north_cars}`;
486
+ document.getElementById('qn-s').innerText = `S: ${state.south_cars}`;
487
+ document.getElementById('qn-e').innerText = `E: ${state.east_cars}`;
488
+ document.getElementById('qn-w').innerText = `W: ${state.west_cars}`;
489
+
490
+ // Draw Cars
491
+ const drawQueue = (id, count, hasEV) => {
492
+ const q = document.getElementById(id);
493
+ q.innerHTML = '';
494
+ const displayCount = Math.min(count, 10);
495
+ for(let i=0; i<displayCount; i++) {
496
+ const car = document.createElement('div');
497
+ car.className = 'car';
498
+ // Mark the first car as the emergency vehicle if the lane's flag is set
499
+ if (i === 0 && hasEV) car.classList.add('emergency');
500
+ q.appendChild(car);
501
+ }
502
+ };
503
+
504
+ const ev = state.emergency_flags;
505
+ drawQueue('q-n', state.north_cars, ev.north);
506
+ drawQueue('q-s', state.south_cars, ev.south);
507
+ drawQueue('q-e', state.east_cars, ev.east);
508
+ drawQueue('q-w', state.west_cars, ev.west);
509
+
510
+ // Metrics
511
+ if (info.total_cleared !== undefined) {
512
+ document.getElementById('val-cleared').innerText = info.total_cleared;
513
+ document.getElementById('val-fairness').innerText = (info.fairness_score || 0).toFixed(2);
514
+ document.getElementById('val-congestion').innerText = (info.congestion_score || 0).toFixed(2);
515
+ }
516
+
517
+ if (data.done) {
518
+ showToast(`Episode Finished! Score: ${info.total_cleared}`);
519
+ if (document.getElementById('auto-play').checked) {
520
+ setTimeout(doReset, 1000);
521
+ }
522
+ }
523
+ }
524
+
525
+ async function doReset() {
526
+ try {
527
+ const res = await fetch('/reset', { method: 'POST' });
528
+ const data = await res.json();
529
+ updateUI(data);
530
+ showToast("Environment Reset");
531
+ } catch(e) { showToast("Error connecting to API"); }
532
+ }
533
+
534
+ async function doStep(action) {
535
+ try {
536
+ const res = await fetch('/step', {
537
+ method: 'POST',
538
+ headers: { 'Content-Type': 'application/json' },
539
+ body: JSON.stringify({ action: action })
540
+ });
541
+ const data = await res.json();
542
+ updateUI(data);
543
+ } catch(e) { showToast("Error connecting to API"); }
544
+ }
545
+
546
+ async function doAutoStep() {
547
+ try {
548
+ const res = await fetch('/auto_step', { method: 'POST' });
549
+ const data = await res.json();
550
+ updateUI(data);
551
+ if (data.action_taken === 1) {
552
+ showToast("Agent triggered phase switch");
553
+ }
554
+ } catch(e) {
555
+ document.getElementById('auto-play').click(); // turn off
556
+ showToast("Agent step failed");
557
+ }
558
+ }
559
+
560
+ // Initial Load
561
+ doReset();
562
+ </script>
563
+ </body>
564
+ </html>
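The page's `doStep` call can be mirrored from a script. The sketch below only builds the request, so it runs without a server; `build_step_request` is a hypothetical helper, and the default base URL assumes the port 7860 used in the Dockerfile.

```python
import json
import urllib.request


def build_step_request(
    action: int, base_url: str = "http://localhost:7860"
) -> urllib.request.Request:
    """Build (but do not send) a POST /step request with a JSON action body."""
    if action not in (0, 1):
        raise ValueError("action must be 0 (keep phase) or 1 (switch phase)")
    body = json.dumps({"action": action}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/step",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Against a running server, passing the result to `urllib.request.urlopen` should return the same JSON payload that the page hands to `updateUI`.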
inference.py ADDED
@@ -0,0 +1,61 @@
1
+
2
+ from fastapi import FastAPI
3
+ from fastapi.responses import HTMLResponse
4
+ from pydantic import BaseModel
5
+ from env import TrafficEnv
6
+ from tasks import get_config
7
+ from baseline_agent import RuleBasedAgent
8
+ import os
9
+
10
+ app = FastAPI()
11
+ env = TrafficEnv(get_config("medium"))
12
+ agent = RuleBasedAgent()
13
+
14
+ class Action(BaseModel):
15
+ action: int
16
+
17
+ @app.get("/", response_class=HTMLResponse)
18
+ def root():
19
+ with open("index.html", "r", encoding="utf-8") as f:
20
+ return f.read()
21
+
22
+ @app.post("/reset")
23
+ def reset():
24
+ state = env.reset()
25
+ try:
26
+ state = state.tolist()
27
+ except AttributeError: # state has no tolist(), e.g. it is already a dict
28
+ pass
29
+ agent.reset()
30
+ return {"state":state}
31
+
32
+ @app.post("/step")
33
+ def step(data:Action):
34
+ state,reward,done,info = env.step(data.action)
35
+ try:
36
+ state = state.tolist()
37
+ except AttributeError: # state has no tolist(), e.g. it is already a dict
38
+ pass
39
+ return {
40
+ "state":state,
41
+ "reward":reward,
42
+ "done":done,
43
+ "info":info
44
+ }
45
+
46
+ @app.post("/auto_step")
47
+ def auto_step():
48
+ state_dict = env.get_state()
49
+ action = agent.select_action(state_dict)
50
+ state,reward,done,info = env.step(action)
51
+ try:
52
+ state = state.tolist()
53
+ except AttributeError: # state has no tolist(), e.g. it is already a dict
54
+ pass
55
+ return {
56
+ "state":state,
57
+ "reward":reward,
58
+ "done":done,
59
+ "info":info,
60
+ "action_taken": action
61
+ }
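The three endpoints above repeat the same `try` / `tolist()` conversion. One way to factor it out is sketched below; `to_jsonable` is a hypothetical name that does not exist in the repository, and `FakeArray` is only a stand-in so the example needs no third-party imports.

```python
from typing import Any


def to_jsonable(state: Any) -> Any:
    """Convert array-like states to plain lists; pass other values through."""
    tolist = getattr(state, "tolist", None)
    return tolist() if callable(tolist) else state


class FakeArray:
    """Stand-in for a NumPy array (anything exposing a tolist() method)."""

    def __init__(self, data):
        self._data = list(data)

    def tolist(self):
        return list(self._data)
```

Each handler could then return `{"state": to_jsonable(state), ...}`. Checking for the method explicitly avoids a bare `except`, which would also hide unrelated errors.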
openenv.yaml ADDED
@@ -0,0 +1,182 @@
1
+ version: "1.0"
2
+ name: "TrafficSignalOptimization-v1"
3
+ description: >
4
+ AI-driven Traffic Signal Optimization for a 4-way urban intersection.
5
+ A reinforcement-learning environment that challenges agents to minimise
6
+ congestion, reduce average waiting time, respond to emergency vehicles,
7
+ and maintain signal stability across three difficulty tiers.
8
+
9
+ author: "OpenEnv Submission"
10
+ tags:
11
+ - Reinforcement Learning
12
+ - Traffic Control
13
+ - Smart Cities
14
+ - Safety-Critical
15
+ - Emergency Vehicle Priority
16
+ licence: MIT
17
+
18
+ # ─────────────────────────────────────────────────────────────────────
19
+ # Environment specification
20
+ # ─────────────────────────────────────────────────────────────────────
21
+ environment:
22
+ class: "env.TrafficEnv"
23
+ entry_point: "env:TrafficEnv"
24
+
25
+ state_space:
26
+ type: Dict
27
+ keys:
28
+ north_cars:
29
+ type: Discrete
30
+ description: "Queued vehicles in the North lane"
31
+ range: [0, max_queue]
32
+ south_cars:
33
+ type: Discrete
34
+ description: "Queued vehicles in the South lane"
35
+ range: [0, max_queue]
36
+ east_cars:
37
+ type: Discrete
38
+ description: "Queued vehicles in the East lane"
39
+ range: [0, max_queue]
40
+ west_cars:
41
+ type: Discrete
42
+ description: "Queued vehicles in the West lane"
43
+ range: [0, max_queue]
44
+ waiting_times:
45
+ type: "Dict[str, float]"
46
+ description: "Cumulative waiting-time pressure per lane (north/south/east/west)"
47
+ phase:
48
+ type: Discrete
49
+ values: [0, 1]
50
+ description: "Current green signal: 0 = NS green, 1 = EW green"
51
+ emergency_flags:
52
+ type: "Dict[str, bool]"
53
+ description: "True if an emergency vehicle is present in that lane"
54
+ step_count:
55
+ type: Discrete
56
+ description: "Current step within the episode"
57
+ range: [0, max_steps]
58
+
59
+ action_space:
60
+ type: Discrete
61
+ n: 2
62
+ actions:
63
+ 0: "Keep current signal phase"
64
+ 1: "Switch signal phase (NS ↔ EW)"
65
+
66
+ observation_vector_dim: 14 # flat numpy array for RL frameworks
67
+ # Layout: [N, S, E, W queues | N, S, E, W waits | N, S, E, W EV flags | phase, step]
68
+
69
+ # ─────────────────────────────────────────────────────────────────────
70
+ # Reward design (multi-component, clipped to [-1, +1])
71
+ # ─────────────────────────────────────────────────────────────────────
72
+ reward:
73
+ range: [-1.0, 1.0]
74
+ components:
75
+ efficiency:
76
+ sign: "+"
77
+ description: "Vehicles cleared this step (throughput reward)"
78
+ congestion:
79
+ sign: "-"
80
+ description: "Normalised total queue density"
81
+ max_queue_penalty:
82
+ sign: "-"
83
+ description: "Penalty for extreme bottlenecks in any single lane"
84
+ switch_penalty:
85
+ sign: "-"
86
+ description: "Stability constraint to prevent oscillatory signal toggling"
87
+ improvement_bonus:
88
+ sign: "+"
89
+ description: "Bonus for active decongestion progress"
90
+ fairness_bonus:
91
+ sign: "+"
92
+ description: "Reward for maintaining balanced waiting times across all lanes"
93
+ starvation_penalty:
94
+ sign: "-"
95
+ description: "Penalty when the longest lane waiting time exceeds the scaled starvation limit"
96
+ emergency_priority:
97
+ sign: "+/-"
98
+ description: "Combination of golden-window bonus and delay penalty for EVs"
99
+
100
+ # ─────────────────────────────────────────────────────────────────────
101
+ # Difficulty modes
102
+ # ─────────────────────────────────────────────────────────────────────
103
+ difficulty_modes:
104
+ easy:
105
+ arrival_rate: [0, 1]
106
+ discharge_rate: [4, 5]
107
+ max_queue: 15
108
+ max_steps: 50
109
+ emergency_prob: 0.01
110
+ burst_prob: 0.0
111
+ description: "Stable, balanced traffic. Minimal emergencies. Ideal for learning."
112
+
113
+ medium:
114
+ arrival_rate: [1, 3]
115
+ discharge_rate: [3, 5]
116
+ max_queue: 25
117
+ max_steps: 100
118
+ emergency_prob: 0.05
119
+ burst_prob: 0.10
120
+ description: "Random traffic bursts, moderate congestion, occasional emergencies."
121
+
122
+ hard:
123
+ arrival_rate: [2, 5]
124
+ discharge_rate: [2, 4]
125
+ max_queue: 40
126
+ max_steps: 200
127
+ emergency_prob: 0.15
128
+ burst_prob: 0.20
129
+ description: "High-intensity traffic, frequent emergencies, strict fairness constraints."
130
+
131
+ # ─────────────────────────────────────────────────────────────────────
132
+ # Evaluation metrics (returned in info dict on every step)
133
+ # ─────────────────────────────────────────────────────────────────────
134
+ metrics:
135
+ total_cleared:
136
+ type: int
137
+ description: "Total vehicles discharged from the intersection (episode)"
138
+ avg_waiting_time:
139
+ type: float
140
+ description: "Cumulative wait pressure divided by vehicles cleared"
141
+ max_queue_length:
142
+ type: int
143
+ description: "Peak queue length observed in any lane (episode)"
144
+ signal_switch_count:
145
+ type: int
146
+ description: "Total signal changes (lower = more stable)"
147
+ congestion_score:
148
+ type: float
149
+ range: [0.0, 1.0]
150
+ description: "Current normalised total queue depth"
151
+ avg_ev_clear_time:
152
+ type: float
153
+ description: "Average steps taken to clear an emergency vehicle"
154
+ fairness_score:
155
+ type: float
156
+ range: [0.0, 1.0]
157
+ description: "Index representing lane-level service balance"
158
+
159
+ # ─────────────────────────────────────────────────────────────────────
160
+ # Baseline agent
161
+ # ─────────────────────────────────────────────────────────────────────
162
+ baseline:
163
+ class: "baseline_agent.RuleBasedAgent"
164
+ description: >
165
+ Deterministic rule-based agent. Switches based on queue imbalance,
166
+ minimum green time, starvation guard, and emergency preemption.
167
+ parameters:
168
+ min_green_time: 5
169
+ imbalance_threshold: 5
170
+ max_green_time: 15
171
+ emergency_min_green: 2
172
+
173
+ # ─────────────────────────────────────────────────────────────────────
174
+ # Project files
175
+ # ─────────────────────────────────────────────────────────────────────
176
+ project_structure:
177
+ - env.py: "Core TrafficEnv class"
178
+ - tasks.py: "Easy / Medium / Hard configuration dicts"
179
+ - baseline_agent.py: "Rule-based baseline agent"
180
+ - test_env.py: "Simulation runner and correctness checks"
181
+ - openenv.yaml: "This file — environment specification"
182
+ - README.md: "Full documentation"
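The `observation_vector_dim: 14` layout comment in this spec describes how the dict state flattens into a vector. A minimal sketch follows, assuming the key names listed under `state_space`; `flatten_state` and this local `LANES` tuple are hypothetical, and the real environment may normalise values or return a NumPy array instead.

```python
LANES = ("north", "south", "east", "west")  # order N, S, E, W as in the layout comment


def flatten_state(state: dict) -> list[float]:
    """Flatten a dict state into [queues | waits | EV flags | phase, step]."""
    queues = [float(state[f"{lane}_cars"]) for lane in LANES]
    waits = [float(state["waiting_times"][lane]) for lane in LANES]
    ev_flags = [1.0 if state["emergency_flags"][lane] else 0.0 for lane in LANES]
    tail = [float(state["phase"]), float(state["step_count"])]
    return queues + waits + ev_flags + tail
```

The result has 4 + 4 + 4 + 2 = 14 entries, matching the declared dimension.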
pyproject.toml ADDED
@@ -0,0 +1,20 @@
1
+ [project]
2
+ name = "traffic-signal-openenv"
3
+ version = "0.1.0"
4
+ description = "Traffic Signal Optimization - OpenEnv Elite"
5
+ readme = "README.md"
6
+ requires-python = ">=3.10"
7
+ dependencies = [
8
+ "fastapi>=0.100.0",
9
+ "uvicorn>=0.20.0",
10
+ "numpy>=1.20.0",
11
+ "pydantic>=2.0.0",
12
+ "openenv-core>=0.2.0",
13
+ ]
14
+
15
+ [project.scripts]
16
+ server = "server.app:main"
17
+
18
+ [build-system]
19
+ requires = ["hatchling"]
20
+ build-backend = "hatchling.build"
requirements.txt ADDED
@@ -0,0 +1,4 @@
1
+ fastapi
2
+ uvicorn
3
+ numpy
4
+ pydantic
server/__pycache__/app.cpython-313.pyc ADDED
Binary file (851 Bytes).
server/app.py ADDED
@@ -0,0 +1,14 @@
1
+ import os
2
+ import sys
3
+ import uvicorn
4
+
5
+ # Add the parent directory to sys.path so that 'inference.py' and the env modules can be imported
6
+ sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
7
+
8
+ from inference import app
9
+
10
+ def main():
11
+ uvicorn.run("server.app:app", host="0.0.0.0", port=7860)
12
+
13
+ if __name__ == "__main__":
14
+ main()
tasks.py ADDED
@@ -0,0 +1,161 @@
1
+ """
2
+ tasks.py — Difficulty Configurations for TrafficEnv
3
+ =====================================================
4
+
5
+ Three pre-defined task configurations:
6
+
7
+ EASY_CONFIG – Stable, balanced traffic; good for initial training.
8
+ MEDIUM_CONFIG – Random bursts, moderate congestion; standard benchmark.
9
+ HARD_CONFIG – High intensity, frequent emergencies, strict fairness.
10
+
11
+ Each config is a plain dict consumed by TrafficEnv.__init__().
12
+ """
13
+
14
+ from __future__ import annotations
15
+ from typing import Any, Dict
16
+
17
+
18
+ # ---------------------------------------------------------------------------
19
+ # Easy
20
+ # ---------------------------------------------------------------------------
21
+
22
+ EASY_CONFIG: Dict[str, Any] = {
23
+ # Traffic flow
24
+ "arrival_rate": (0, 1), # 0–1 cars per lane per step
25
+ "discharge_rate": (4, 5), # 4–5 cars discharged per green lane per step
26
+ "max_queue": 15, # queue cap per lane
27
+ "max_steps": 50,
28
+
29
+ # Emergencies — rare
30
+ "emergency_prob": 0.01,
31
+
32
+ # Bursts — none
33
+ "burst_prob": 0.0,
34
+ "burst_multiplier": 1.0,
35
+
36
+ # Reward knobs
37
+ "switch_penalty": 0.10,
38
+ "starvation_threshold": 20,
39
+ "r_efficiency_scale": 0.20,
40
+ "p_congestion_scale": 0.30,
41
+ "p_max_q_scale": 0.10,
42
+ "p_starvation_scale": 0.10,
43
+ "r_fairness_bonus": 0.05,
44
+ "r_improvement_bonus": 0.15,
45
+ "p_emergency_scale": 0.30,
46
+ "r_ev_bonus_scale": 0.20,
47
+
48
+ # Logic thresholds
49
+ "ev_golden_window": 8, # Easy: very generous window
50
+ "ev_max_delay": 20,
51
+ }
52
+
+ # ---------------------------------------------------------------------------
+ # Medium
+ # ---------------------------------------------------------------------------
+
+ MEDIUM_CONFIG: Dict[str, Any] = {
+     # Traffic flow
+     "arrival_rate": (1, 3),      # moderate, variable arrivals
+     "discharge_rate": (3, 5),    # standard discharge
+     "max_queue": 25,
+     "max_steps": 100,
+
+     # Emergencies: occasional
+     "emergency_prob": 0.05,
+
+     # Random bursts: 10% chance, 1.5× arrivals
+     "burst_prob": 0.10,
+     "burst_multiplier": 1.5,
+
+     # Reward knobs
+     "switch_penalty": 0.20,
+     "starvation_threshold": 15,
+     "r_efficiency_scale": 0.20,
+     "p_congestion_scale": 0.40,
+     "p_max_q_scale": 0.15,
+     "p_starvation_scale": 0.15,
+     "r_fairness_bonus": 0.10,
+     "r_improvement_bonus": 0.20,
+     "p_emergency_scale": 0.40,
+     "r_ev_bonus_scale": 0.25,
+
+     # Logic thresholds
+     "ev_golden_window": 5,       # Medium: standard window
+     "ev_max_delay": 15,
+ }
+
+ # ---------------------------------------------------------------------------
+ # Hard
+ # ---------------------------------------------------------------------------
+
+ HARD_CONFIG: Dict[str, Any] = {
+     # Traffic flow: high intensity
+     "arrival_rate": (2, 5),      # heavy, bursty arrivals
+     "discharge_rate": (2, 4),    # reduced discharge (lane friction)
+     "max_queue": 40,
+     "max_steps": 200,
+
+     # Emergencies: frequent
+     "emergency_prob": 0.15,
+
+     # Frequent aggressive bursts
+     "burst_prob": 0.20,
+     "burst_multiplier": 2.0,
+
+     # Reward knobs: stricter penalties
+     "switch_penalty": 0.30,
+     "starvation_threshold": 10,  # stricter fairness
+     "r_efficiency_scale": 0.25,
+     "p_congestion_scale": 0.50,
+     "p_max_q_scale": 0.20,
+     "p_starvation_scale": 0.20,
+     "r_fairness_bonus": 0.15,
+     "r_improvement_bonus": 0.25,
+     "p_emergency_scale": 0.60,   # amplified emergency penalty
+     "r_ev_bonus_scale": 0.30,
+
+     # Logic thresholds
+     "ev_golden_window": 3,       # Hard: must clear immediately
+     "ev_max_delay": 10,
+ }
+
+
+ # ---------------------------------------------------------------------------
+ # Accessor
+ # ---------------------------------------------------------------------------
+
+ _CONFIGS = {
+     "easy": EASY_CONFIG,
+     "medium": MEDIUM_CONFIG,
+     "hard": HARD_CONFIG,
+ }
+
+
+ def get_config(mode: str) -> Dict[str, Any]:
+     """
+     Return the config dict for the requested difficulty mode.
+
+     Parameters
+     ----------
+     mode : str
+         One of "easy", "medium", "hard" (case-insensitive).
+
+     Returns
+     -------
+     dict
+         Configuration dictionary suitable for ``TrafficEnv(config)``.
+
+     Raises
+     ------
+     ValueError
+         If an unknown mode is requested.
+     """
+     key = mode.strip().lower()
+     if key not in _CONFIGS:
+         raise ValueError(
+             f"Unknown difficulty mode '{mode}'. "
+             f"Choose one of: {list(_CONFIGS)}"
+         )
+     # Return a copy so callers can mutate without side-effects
+     return dict(_CONFIGS[key])
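
Because `get_config` returns a copy, a caller can adjust a preset without touching the module-level dicts. The sketch below illustrates that behaviour in a self-contained way. It re-declares an abbreviated `_CONFIGS` (only three keys per mode, copied from the values above) so it runs on its own, and `with_overrides` is a hypothetical helper that is not part of this commit.

```python
from typing import Any, Dict

# Abbreviated stand-ins for the presets in tasks.py; only a few keys are
# copied here so the snippet does not depend on the real module.
_CONFIGS: Dict[str, Dict[str, Any]] = {
    "easy": {"arrival_rate": (0, 1), "max_steps": 50, "ev_golden_window": 8},
    "medium": {"arrival_rate": (1, 3), "max_steps": 100, "ev_golden_window": 5},
    "hard": {"arrival_rate": (2, 5), "max_steps": 200, "ev_golden_window": 3},
}


def get_config(mode: str) -> Dict[str, Any]:
    """Same lookup logic as the accessor in tasks.py."""
    key = mode.strip().lower()
    if key not in _CONFIGS:
        raise ValueError(
            f"Unknown difficulty mode '{mode}'. Choose one of: {list(_CONFIGS)}"
        )
    # Shallow copy: enough here because every value is an int, float or tuple.
    return dict(_CONFIGS[key])


def with_overrides(mode: str, **overrides: Any) -> Dict[str, Any]:
    """Hypothetical helper: start from a preset and override known keys only."""
    config = get_config(mode)
    unknown = set(overrides) - set(config)
    if unknown:
        raise KeyError(f"Unknown config keys: {sorted(unknown)}")
    config.update(overrides)
    return config


# Shorten a hard episode for a quick smoke test; the preset itself is unchanged.
short_hard = with_overrides(" Hard ", max_steps=20)
print(short_hard["max_steps"])          # 20
print(get_config("hard")["max_steps"])  # 200
```

Rejecting unknown keys in the helper is a design choice for this sketch: it catches typos such as `max_step` early instead of silently adding a key the environment would never read.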
uv.lock ADDED
The diff for this file is too large to render. See raw diff