	---
	title: Traffic Signal Optimization — OpenEnv Elite
	emoji: 🚦
	colorFrom: blue
	colorTo: green
	sdk: docker
	app_port: 7860
	pinned: false
	---

	# 🚥 Traffic Signal Optimization — OpenEnv Elite

	> Meta × PyTorch OpenEnv Hackathon Submission
	>
	> A Reinforcement Learning environment for urban traffic control, featuring stochastic multi-lane dynamics, emergency vehicle prioritization, and fairness-driven rewards.

	---

	## 🏗️ Problem Statement

	Fixed-cycle traffic signals are a relic of the past. In modern urban environments, they create needless congestion, increase CO2 emissions, and — most critically — cause life-threatening delays for emergency vehicles.

	This project provides a high-fidelity 4-way intersection simulation designed for OpenEnv. It challenges RL agents to move beyond simple throughput and master the art of dynamic balancing: serving high-demand lanes while maintaining fairness for low-traffic directions and clearing "Golden Windows" for emergency responders.

	---

	## 🚀 Quick Start

	```bash
	# Run the complete suite: Simulation + Sanity Checks + Comparison
	python test_env.py

	# Run a specific high-intensity scenario
	python test_env.py hard
	```

	```python
	from env import TrafficEnv
	from tasks import get_config
	from baseline_agent import RuleBasedAgent

	# 1. Load a structured difficulty profile
	config = get_config("medium")
	env = TrafficEnv(config)

	# 2. Initialize our sophisticated Rule-Based Controller
	agent = RuleBasedAgent()

	state = env.reset()
	done = False

	# 3. Run one episode until the environment reports termination
	while not done:
	    action = agent.select_action(state)
	    state, reward, done, info = env.step(action)

	print(f"Total Cleared: {info['total_cleared']}")
	print(f"Fairness Index: {info['fairness_score']:.2f}")
	```

	---

	## 🧠 Environment Design Philosophy

	### State Space
	The environment exposes a 14-dimensional continuous observation vector, providing the agent with full situational awareness:
	- Queues (4): Exact vehicle count per lane [N, S, E, W].
	- Wait Pressure (4): Cumulative "impatience" score per lane.
	- Emergency Flags (4): Binary detection of EVs per lane.
	- Signal State (2): Current phase [0=NS, 1=EW] and step count.
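
As a rough illustration of the layout above, the sketch below assembles the four groups into one 14-element vector. The helper name `build_observation`, its argument names, and the ordering of the groups are assumptions made for illustration; the actual layout is whatever `env.py` defines.

```python
def build_observation(queues, wait_pressure, ev_flags, phase, steps_in_phase):
    """Flatten the four observation groups into a 14-element list of floats.

    Hypothetical helper: the group order follows the list above, but the
    real environment may order or scale the fields differently.
    """
    obs = [float(q) for q in queues]              # 4 values: N, S, E, W
    obs += [float(w) for w in wait_pressure]      # 4 values: wait pressure per lane
    obs += [float(bool(f)) for f in ev_flags]     # 4 values: 0.0 or 1.0
    obs += [float(phase), float(steps_in_phase)]  # 2 values: phase and step count
    if len(obs) != 14:
        raise ValueError(f"expected 14 features, got {len(obs)}")
    return obs


# Example: an emergency vehicle waiting in the East lane while NS is green.
obs = build_observation(
    queues=[3, 0, 4, 1],
    wait_pressure=[2.0, 0.0, 5.5, 1.0],
    ev_flags=[0, 0, 1, 0],
    phase=0,
    steps_in_phase=6,
)
```

Keeping the groups in fixed positions means a policy network can treat indices 8 to 11 as the emergency flags without any extra metadata.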

	### Action Space
	- `0`: Maintain — keep the current green phase.
	- `1`: Switch — transition the signal (includes yellow-phase discharge friction).
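
The two actions can be read as a toggle on the current phase. The sketch below is a minimal illustration under assumed names (`apply_action`, `YELLOW_FRICTION`); the friction factor of 0.5 is a placeholder and not a value taken from the environment.

```python
MAINTAIN, SWITCH = 0, 1
YELLOW_FRICTION = 0.5  # assumed: share of normal discharge during a switch step


def apply_action(phase, steps_in_phase, action):
    """Return (new_phase, new_steps_in_phase, discharge_scale).

    phase is 0 for NS green and 1 for EW green, as in the state description.
    """
    if action == SWITCH:
        # Flip the phase, restart the step counter, and reduce discharge
        # for this step to model the yellow transition.
        return 1 - phase, 0, YELLOW_FRICTION
    if action == MAINTAIN:
        return phase, steps_in_phase + 1, 1.0
    raise ValueError(f"unknown action: {action}")
```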

	---

	## 💎 Reward Engineering (The "Judge's Choice")

	Our reward function is the core of this submission. Rather than a raw vehicle count, it is a multi-objective framework whose per-step total is clipped to `[-1, 1]`:

	| Component | Logic | Purpose |
	| :--- | :--- | :--- |
	| Throughput (+) | `+0.20 * cars_cleared` | Incentivizes active vehicle flow. |
	| Density (-) | `-0.40 * total_congestion` | Penalizes letting the intersection fill up. |
	| Bottleneck (-) | `-0.15 * max_queue` | Discourages extreme build-up in any single lane. |
	| Stability (-) | `-switch_penalty` | Prevents "flickering" and promotes signal stability. |
	| Fairness (+/-) | `+0.10` bonus / `-penalty` | Rewards balanced service; penalizes starvation. |
	| Emergency (🚨) | `Golden Window` Bonus | Massive reward for clearing EVs within target steps. |
	| EV Delay (-) | `Exponential Penalty` | Punishes agents for delaying life-saving vehicles. |
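
To make the table concrete, here is a minimal sketch of how the components could be summed and clipped. It is not the environment's actual implementation. It assumes that `total_congestion` and `max_queue` are normalized to `[0, 1]`, that the fairness and emergency terms are computed elsewhere and passed in, and that the switch penalty is 0.20 (the value shown in the step walkthrough). The function name `compute_reward` is hypothetical.

```python
def compute_reward(cars_cleared, total_congestion, max_queue, switched,
                   fairness_term=0.0, ev_term=0.0, switch_penalty=0.20):
    """Combine the reward components from the table and clip to [-1, 1].

    Assumptions: total_congestion and max_queue are normalized to [0, 1];
    fairness_term and ev_term are signed values computed by other logic.
    """
    reward = 0.20 * cars_cleared       # throughput
    reward -= 0.40 * total_congestion  # density
    reward -= 0.15 * max_queue         # bottleneck
    if switched:
        reward -= switch_penalty       # stability
    reward += fairness_term            # fairness bonus or penalty
    reward += ev_term                  # Golden Window bonus or EV delay penalty
    return max(-1.0, min(1.0, reward))


# Three cars cleared on a switch step, with the fairness bonus applied.
example = compute_reward(3, 0.5, 0.4, switched=True, fairness_term=0.10)
```

Clipping keeps a single large term, such as a long EV delay, from dominating the learning signal by more than one unit per step.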

	---

	## 📊 Evaluation Metrics

	We track eight key performance indicators per episode so that agents can be compared quantitatively:

	1. Total Cleared: Raw efficiency metric.
	2. Avg Waiting Time: The "commuter frustration" index.
	3. Max Queue Length: Gauges system robustness against bottlenecks.
	4. Signal Switch Count: Measures policy stability.
	5. Congestion Score: Final system state snapshot.
	6. Avg EV Clear Time: Critical safety metric (lower is better).
	7. Fairness Score: [0, 1] index — how equally did we serve all lanes?
	8. Total EV Penalty: Measures total failure to prioritize safety.
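
This README does not state how the Fairness Score is computed. One common choice for this kind of metric is Jain's fairness index, sketched below as an assumption rather than as the project's formula. Note that Jain's index ranges over `[1/n, 1]` for `n` lanes, so it would need rescaling to span the full `[0, 1]` interval.

```python
def jain_fairness(served_per_lane):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2).

    Returns 1.0 when every lane received the same service and 1/n when a
    single lane received all of it. An all-zero input is treated as fair.
    """
    total = sum(served_per_lane)
    if total == 0:
        return 1.0
    squares = sum(x * x for x in served_per_lane)
    return (total * total) / (len(served_per_lane) * squares)
```

For example, `jain_fairness([5, 5, 5, 5])` evaluates to `1.0`, while `jain_fairness([8, 0, 0, 0])` evaluates to `0.25`.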

	---

	## ⚡ Task Difficulty Levels

	| Parameter | Easy | Medium | Hard |
	| :--- | :--- | :--- | :--- |
	| Arrival Rate | 0–1 | 1–3 | 2–5 |
	| Discharge Rate | 4–5 | 3–5 | 2–4 |
	| Burst Frequency | 0% | 10% | 20% |
	| Emergency Prob | 1% | 5% | 15% |
	| EV Golden Window | 8 steps | 5 steps | 3 steps |
	| Fairness Limit | 20 steps | 15 steps | 10 steps |
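
The table can be expressed as a small set of configuration objects. The sketch below uses a hypothetical `DifficultyProfile` dataclass and `PROFILES` dictionary; the structure actually returned by `get_config` in `tasks.py` may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DifficultyProfile:
    """Hypothetical container for one column of the difficulty table."""
    arrival_rate: tuple      # (min, max) range from the table; units not stated
    discharge_rate: tuple    # (min, max) range from the table; units not stated
    burst_frequency: float   # probability, e.g. 0.10 for 10%
    emergency_prob: float    # probability, e.g. 0.05 for 5%
    golden_window: int       # steps
    fairness_limit: int      # steps


PROFILES = {
    "easy": DifficultyProfile((0, 1), (4, 5), 0.00, 0.01, 8, 20),
    "medium": DifficultyProfile((1, 3), (3, 5), 0.10, 0.05, 5, 15),
    "hard": DifficultyProfile((2, 5), (2, 4), 0.20, 0.15, 3, 10),
}
```

Freezing the dataclass prevents a profile from being mutated mid-episode, which keeps runs at the same difficulty comparable.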

	---

	## 🚑 Emergency & Fairness Logic

	### The "Golden Window"
	When an Emergency Vehicle (EV) appears, the agent is granted a bonus if it switches and clears the lane within the Golden Window (defined per difficulty). Failing to do so triggers an exponential delay penalty, simulating the real-world cost of stopping an ambulance or fire truck.
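
The bonus and penalty described above might look like the following sketch. The `+0.25` bonus matches the value in the step walkthrough; the exponential form and its `base`, `rate`, and `cap` parameters are illustrative assumptions, since the exact penalty curve is not given here. Both function names are hypothetical.

```python
import math


def ev_clear_bonus(clear_time, golden_window, bonus=0.25):
    """Bonus granted when the EV is cleared within the Golden Window."""
    return bonus if clear_time <= golden_window else 0.0


def ev_delay_penalty(wait_steps, base=0.05, rate=0.5, cap=1.0):
    """Per-step penalty that grows exponentially while the EV waits.

    base, rate, and cap are assumed values chosen for illustration only.
    """
    return -min(cap, base * math.exp(rate * wait_steps))
```

With these placeholder parameters, the penalty starts at `-0.05` on the step the EV appears and saturates at `-1.0` for long delays, so every extra step of waiting costs more than the previous one until the cap is reached.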

	### Fairness Guard
	To prevent starvation (where the agent ignores a low-traffic lane to maximize throughput on a high-traffic lane), the environment computes a Fairness Score. If a lane stays red longer than the Fairness Limit for the chosen difficulty, the agent incurs a heavy penalty. This forces the agent to learn the trade-off between total throughput and social fairness.
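
A starvation check of this kind can be sketched with a per-lane counter of consecutive red steps. The helper names `update_red_timers` and `starved_lanes` and the lane keys are assumptions for illustration; the environment may track this differently.

```python
def update_red_timers(red_steps, phase):
    """Advance per-lane red counters for one step.

    red_steps maps lane name to consecutive red steps; phase is 0 when
    NS is green and 1 when EW is green. Green lanes reset to zero.
    """
    green = ("N", "S") if phase == 0 else ("E", "W")
    return {
        lane: 0 if lane in green else steps + 1
        for lane, steps in red_steps.items()
    }


def starved_lanes(red_steps, fairness_limit):
    """Return the lanes that have been red for longer than the limit."""
    return [lane for lane, steps in red_steps.items() if steps > fairness_limit]


# Medium difficulty (limit of 15 steps): EW has been red for 15 steps already.
timers = {"N": 0, "S": 0, "E": 15, "W": 15}
timers = update_red_timers(timers, phase=0)
starved = starved_lanes(timers, fairness_limit=15)
```

Here `starved` contains the East and West lanes, which is the condition under which the heavy penalty would apply.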

	---

	## 🚶 Step Walkthrough

	```text
	Step 12: 🚨 Ambulance detected in East lane (currently RED).
	- EW Queue: 4, EV Timer: 0
	- Agent receives p_emergency penalty.

	Step 13: Agent Action: 1 (SWITCH to EW).
	- Switch penalty applied (-0.20).
	- NS lanes stop; EW lanes turn GREEN.

	Step 14: EV Cleared!
	- EV Clear Time: 2 steps.
	- Agent receives r_ev_bonus (+0.25) for "Golden Window" clearance.
	- Throughput: 3 cars cleared (+0.60 reward at 0.20 per car).
	```

	---

	## 🔮 Future Improvements

	- Multi-Intersection Coordination: Extending to a grid of agents using MARL.
	- Pedestrian Logic: Adding crosswalks and pedestrian priority.
	- V2X Communication: Providing agents with ahead-of-time traffic predictions.

	---

	## 📜 License

	MIT © 2026 Meta x PyTorch OpenEnv Hackathon