oozan committed · Commit a283aa6 · verified · 1 Parent(s): 8d86a90

Upload folder using huggingface_hub

Dockerfile ADDED
@@ -0,0 +1,10 @@
+ FROM python:3.10-slim
+
+ WORKDIR /app
+
+ COPY requirements.txt .
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ COPY . .
+
+ CMD ["python", "run_evaluation.py"]
README.md ADDED
@@ -0,0 +1,138 @@
+ <h1 align="center">⚡ EVChargeEnv</h1>
+ <p align="center">
+ <img src="assets/evchargeenv-banner.png" width="800" />
+ </p>
+
+ <h3 align="center">Green Agent Benchmark for EV Charging Optimization</h3>
+
+ ---
+
+ ## Overview
+
+ EVChargeEnv is a lightweight, stochastic reinforcement-learning environment designed for the
+ AgentX + AgentBeats Competition (Berkeley RDI 2025).
+
+ It simulates:
+
+ - Electric vehicle battery charging
+ - Dynamic electricity pricing
+ - Fluctuating grid load
+ - Continuous control actions
+ - Multi-objective tradeoffs (cost vs. speed vs. grid stability)
+
+ ---
+
+ ## Task Goal
+
+ The purple agent must:
+
+ - Charge the EV battery to full (1.0)
+ - Minimize electricity cost
+ - Avoid high grid load
+ - Adapt to changing conditions
+
+ ---
+
+ ## State Space (Observation)
+
+ The agent receives a four-element observation, each component normalized to the range 0-1:
+
+ `charge_level`, `price`, `grid_load`, `time_step_norm`
+
+ ---
+
+ ## Action Space
+
+ A single continuous charge rate in the range 0.0 to 1.0.
+
+ ---
+
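+ ## Reward Function
+
+ Each step's reward is the charging progress minus three penalties:
+
+ `reward = progress_reward - cost_penalty - overload_penalty - time_penalty`
+
+ - cost_penalty grows with price × charge rate
+ - overload_penalty applies when grid load plus charging exceeds the load threshold
+ - time_penalty is a small constant charged every step
+
+ ---
+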
+ ## Scenarios
+
+ Three difficulty levels (`easy`, `medium`, `hard`) that differ in base price, base grid load, overload threshold, and charging speed.
+
+ ---
+
+ ## Episode Termination
+
+ An episode ends when the battery is full (charge ≥ 0.999 in the implementation) or the maximum number of steps (48 by default) is reached.
+
+ ---
+
+ ## Example Agent Behaviors
+
+ - Greedy agent: fast but expensive
+ - Price-aware agent: slower but cheaper
+ - Random agent: unstable
+
+ ---
+
+ ## Evaluation Output
+
+ Running:
+
+     python run_evaluation.py
+
+ prints a JSON summary like the following and also writes it to `sample_output.json`:
+
+     {
+       "avg_reward": ...,
+       "avg_steps": ...,
+       "episodes": 5
+     }
+
+ ---
+
+ ## Docker Support
+
+ Image: `oozan/evchargeenv:latest`
+
+ ---
+
+ ## File Structure
+
+     env/
+     agent/
+     run_evaluation.py
+     Dockerfile
+     requirements.txt
+     README.md
+
+ ---
+
+ ## Future Improvements
+
+ - renewable energy factor
+ - blackout events
+ - degradation model
+ - RL baseline
+ - trajectory visualizer
+ - mini-game UI
+
+ ## Benchmark Specification
+
+ This repository also includes a machine-readable benchmark manifest:
+
+ - `evchargeenv_manifest.json`
+
+ It documents:
+
+ - state and action spaces
+ - reward components
+ - termination conditions
+ - supported scenarios (`easy`, `medium`, `hard`)
+ - evaluation output format (JSON fields)
+
+ This makes EVChargeEnv easier to integrate as a standardized benchmark and aligns with the spirit of the OpenEnv challenge: environments that are transparent, reproducible, and extensible.
agent/__pycache__/baseline_agent.cpython-312.pyc ADDED
Binary file (728 Bytes)
 
agent/__pycache__/price_aware_agent.cpython-312.pyc ADDED
Binary file (2.02 kB)
 
agent/_init_.py ADDED
File without changes
agent/baseline_agent.py ADDED
@@ -0,0 +1,5 @@
+ import numpy as np
+
+ class BaselineAgent:
+     def select_action(self, observation):
+         return np.array([np.random.random()], dtype=np.float32)
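The README mentions a greedy agent ("fast but expensive"), but this commit does not include one. A minimal sketch of what it could look like follows, using the same `select_action(observation)` interface as `BaselineAgent`. The class name `GreedyAgent` and its `rate` parameter are assumptions for illustration, and the sketch returns a plain list to stay dependency-free (the environment's `step()` only reads `action[0]`).

```python
class GreedyAgent:
    """Hypothetical agent (not part of this commit): always charges at a fixed maximum rate."""

    def __init__(self, rate: float = 1.0):
        # Clamp to the action range [0, 1] documented for EVChargeEnv.
        self.rate = min(1.0, max(0.0, rate))

    def select_action(self, observation):
        # The observation is ignored on purpose: greedy charging never
        # reacts to price or grid load.
        # A one-element list is enough because EVChargeEnv.step() reads action[0];
        # swap in np.array([...], dtype=np.float32) for parity with the other agents.
        return [self.rate]


agent = GreedyAgent()
action = agent.select_action([0.2, 0.9, 0.9, 0.0])  # high price and high load are ignored
print(action)
```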
agent/price_aware_agent.py ADDED
@@ -0,0 +1,44 @@
+ import numpy as np
+
+
+ class PriceAwareAgent:
+     """
+     Heuristic agent for EVChargeEnv.
+
+     - Charges more when price is low and grid load is safe.
+     - Charges less when price is high or grid load is high.
+     """
+
+     def __init__(self,
+                  low_price_threshold: float = 0.4,
+                  high_price_threshold: float = 0.7,
+                  high_load_threshold: float = 0.8):
+         self.low_price_threshold = low_price_threshold
+         self.high_price_threshold = high_price_threshold
+         self.high_load_threshold = high_load_threshold
+
+     def select_action(self, observation):
+         """
+         observation = [charge, price, load, time_step_norm]
+         returns: np.array([action]) in [0, 1]
+         """
+         charge, price, load, t = observation
+
+         # If almost full, stop charging (the env only terminates at charge >= 0.999, so this can idle until truncation).
+         if charge >= 0.98:
+             return np.array([0.0], dtype=np.float32)
+
+         # If grid is very stressed, back off.
+         if load >= self.high_load_threshold:
+             return np.array([0.1], dtype=np.float32)
+
+         # If price is low, charge aggressively.
+         if price <= self.low_price_threshold:
+             return np.array([0.9], dtype=np.float32)
+
+         # If price is very high, charge slowly, just enough to make progress.
+         if price >= self.high_price_threshold:
+             return np.array([0.2], dtype=np.float32)
+
+         # Medium case: moderate charging.
+         return np.array([0.5], dtype=np.float32)
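The rules in `PriceAwareAgent.select_action` are checked in a fixed order, so the grid-load rule wins over the price rules. The sketch below restates the same thresholds as a dependency-free function to make that priority easy to see; the function name `price_aware_rate` and its keyword names are assumptions for illustration, not part of the commit.

```python
def price_aware_rate(charge, price, load,
                     low_price=0.4, high_price=0.7, high_load=0.8):
    """Return the charge rate picked by the threshold rules, in priority order."""
    if charge >= 0.98:      # 1. nearly full: stop
        return 0.0
    if load >= high_load:   # 2. stressed grid: back off, whatever the price
        return 0.1
    if price <= low_price:  # 3. cheap electricity: charge aggressively
        return 0.9
    if price >= high_price: # 4. expensive electricity: trickle charge
        return 0.2
    return 0.5              # 5. otherwise: moderate rate


# Same low price in both calls; only the grid load differs.
calm_grid = price_aware_rate(charge=0.5, price=0.2, load=0.5)
stressed_grid = price_aware_rate(charge=0.5, price=0.2, load=0.9)
print(calm_grid, stressed_grid)
```

With a cheap price, the agent charges at 0.9 on a calm grid but drops to 0.1 once the load reaches the threshold.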
env/__pycache__/ev_charge_env.cpython-312.pyc ADDED
Binary file (6.78 kB)
 
env/_init_.py ADDED
@@ -0,0 +1,17 @@
+ from .ev_charge_env import EVChargeEnv
+
+
+ def register_env():
+     """
+     Register EVChargeEnv in an OpenEnv-compatible registry.
+     """
+     try:
+         import openenv
+         openenv.register(
+             id="EVChargeEnv-v0",
+             entry_point="env.ev_charge_env:EVChargeEnv",
+         )
+         print("EVChargeEnv-v0 registered successfully.")
+     except ImportError:
+         # OpenEnv not installed – safe fallback
+         pass
env/ev_charge_env.py ADDED
@@ -0,0 +1,167 @@
+ import gymnasium as gym
+ from gymnasium import spaces
+ import numpy as np
+
+
+ class EVChargeEnv(gym.Env):
+     """
+     EV charging environment.
+
+     Goal:
+       - Reach full battery (charge = 1.0)
+       - Minimize cost
+       - Avoid stressing the grid
+
+     State (obs):
+       [charge_level, price, grid_load, time_step_norm]
+
+     Action:
+       continuous charging rate in [0.0, 1.0]
+     """
+
+     metadata = {"render_modes": ["human"]}
+
+     def __init__(self, max_steps: int = 48, scenario: str = "medium"):
+         super().__init__()
+
+         # Scenario difficulty
+         assert scenario in ["easy", "medium", "hard"]
+         self.scenario = scenario
+
+         # Observation: charge, price, load, time
+         self.observation_space = spaces.Box(
+             low=np.array([0.0, 0.0, 0.0, 0.0], dtype=np.float32),
+             high=np.array([1.0, 1.0, 1.0, 1.0], dtype=np.float32),
+             dtype=np.float32,
+         )
+
+         # Action: charge rate between 0 and 1
+         self.action_space = spaces.Box(
+             low=np.array([0.0], dtype=np.float32),
+             high=np.array([1.0], dtype=np.float32),
+             dtype=np.float32,
+         )
+
+         self.max_steps = max_steps
+         self.step_count = 0
+
+         # Internal state
+         self.charge = 0.0
+         self.price = 0.0
+         self.grid_load = 0.0
+
+         # Scenario parameters (set in reset)
+         self.base_price = 0.3
+         self.base_load = 0.5
+         self.load_threshold = 0.8  # above this → overload penalty
+         self.charge_rate_scale = 0.08  # how fast battery fills
+
+     def _set_scenario_params(self):
+         """Set parameters based on difficulty scenario."""
+         if self.scenario == "easy":
+             self.base_price = 0.25
+             self.base_load = 0.4
+             self.load_threshold = 0.9
+             self.charge_rate_scale = 0.10
+         elif self.scenario == "medium":
+             self.base_price = 0.30
+             self.base_load = 0.5
+             self.load_threshold = 0.85
+             self.charge_rate_scale = 0.08
+         else:  # hard
+             self.base_price = 0.35
+             self.base_load = 0.6
+             self.load_threshold = 0.8
+             self.charge_rate_scale = 0.06
+
+     def reset(self, seed=None, options=None):
+         super().reset(seed=seed)
+         # super().reset() seeds self.np_random, so the env draws from that
+         # generator instead of reseeding NumPy's global random state.
+
+         self._set_scenario_params()
+
+         self.step_count = 0
+         # Random initial charge, slightly low
+         self.charge = float(self.np_random.uniform(0.1, 0.4))
+         # Start price/load around base with small noise
+         self.price = float(np.clip(self.base_price + self.np_random.normal(0, 0.05), 0.0, 1.0))
+         self.grid_load = float(np.clip(self.base_load + self.np_random.normal(0, 0.05), 0.0, 1.0))
+
+         obs = self._get_obs()
+         return obs, {}
+
+     def _get_obs(self):
+         time_step_norm = self.step_count / max(1, self.max_steps)  # stays within [0, 1] on the final step
+         return np.array(
+             [self.charge, self.price, self.grid_load, time_step_norm],
+             dtype=np.float32,
+         )
+
+     def step(self, action):
+         self.step_count += 1
+
+         # Clamp action into valid range
+         a = float(np.clip(action[0], 0.0, 1.0))
+
+         # --- Dynamics ---
+         # Battery charging
+         self.charge += a * self.charge_rate_scale
+         self.charge = float(np.clip(self.charge, 0.0, 1.0))
+
+         # Price & load as noisy processes around base values
+         self.price = float(
+             np.clip(
+                 self.price * 0.7
+                 + self.base_price * 0.3
+                 + self.np_random.normal(0, 0.05),
+                 0.0,
+                 1.0,
+             )
+         )
+         self.grid_load = float(
+             np.clip(
+                 self.grid_load * 0.6
+                 + self.base_load * 0.4
+                 + self.np_random.normal(0, 0.07),
+                 0.0,
+                 1.0,
+             )
+         )
+
+         # --- Reward ---
+         # Progress reward
+         progress = a * self.charge_rate_scale
+         progress_reward = progress * 5.0  # scaled up
+
+         # Cost penalty (higher price * more charging = worse)
+         cost_penalty = self.price * a * 4.0
+
+         # Grid overload penalty if we charge too much when load is high
+         effective_load = self.grid_load + a * 0.2
+         overload = max(0.0, effective_load - self.load_threshold)
+         overload_penalty = overload * 6.0
+
+         # Small time penalty to encourage faster completion
+         time_penalty = 0.01
+
+         reward = progress_reward - cost_penalty - overload_penalty - time_penalty
+
+         # Episode done?
+         terminated = self.charge >= 0.999
+         truncated = self.step_count >= self.max_steps
+
+         obs = self._get_obs()
+         info = {
+             "progress_reward": progress_reward,
+             "cost_penalty": cost_penalty,
+             "overload_penalty": overload_penalty,
+         }
+
+         return obs, reward, terminated, truncated, info
+
+     def render(self):
+         print(
+             f"step={self.step_count} charge={self.charge:.3f} "
+             f"price={self.price:.3f} load={self.grid_load:.3f}"
+         )
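To make the reward arithmetic in `step()` concrete, here is a small worked example that mirrors its formula with the same coefficients (5.0, 4.0, 6.0, 0.2, 0.01) and the `medium` scenario defaults. The helper names `step_reward` and `break_even_price` are assumptions for illustration. The `price` and `grid_load` arguments stand for the values after that step's noise update, since that is what `step()` uses.

```python
def step_reward(price, grid_load, action, load_threshold=0.85, charge_rate_scale=0.08):
    """Recompute the per-step reward from EVChargeEnv.step() for given inputs."""
    a = min(1.0, max(0.0, action))                   # same clamp as the environment
    progress_reward = a * charge_rate_scale * 5.0
    cost_penalty = price * a * 4.0
    overload = max(0.0, grid_load + a * 0.2 - load_threshold)
    overload_penalty = overload * 6.0
    time_penalty = 0.01
    return progress_reward - cost_penalty - overload_penalty - time_penalty


def break_even_price(charge_rate_scale):
    """Price at which progress_reward equals cost_penalty (other penalties ignored).

    progress_reward - cost_penalty = a * (5 * scale - 4 * price),
    which is zero when price = 5 * scale / 4.
    """
    return 5.0 * charge_rate_scale / 4.0


# Full-rate charging at the medium base price and base load:
# 0.4 (progress) - 1.2 (cost) - 0.0 (no overload) - 0.01 (time)
r = step_reward(price=0.30, grid_load=0.50, action=1.0)
print(round(r, 4), round(break_even_price(0.08), 4))
```

Under these coefficients the break-even price for `medium` is about 0.10, well below the base price of 0.30. That means most charging steps yield a negative reward, which is consistent with the negative `avg_reward` in `sample_output.json`.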
evchargeenv_manifest.json ADDED
@@ -0,0 +1,37 @@
+ {
+   "name": "EVChargeEnv",
+   "description": "An EV charging optimization benchmark environment for testing agents under dynamic prices and variable grid load.",
+   "version": "0.1.0",
+   "task_type": "continuous_control",
+   "domain": "energy_ev_charging",
+   "observation_space": {
+     "type": "Box",
+     "shape": [4],
+     "components": [
+       "charge_level (0-1)",
+       "price (0-1)",
+       "grid_load (0-1)",
+       "time_step_norm (0-1)"
+     ]
+   },
+   "action_space": {
+     "type": "Box",
+     "shape": [1],
+     "description": "continuous charging rate in [0, 1]"
+   },
+   "scenarios": ["easy", "medium", "hard"],
+   "reward_components": [
+     "progress_reward (battery increase)",
+     "cost_penalty (price * charge_rate)",
+     "overload_penalty (high grid load + high charging)",
+     "time_penalty (encourages faster completion)"
+   ],
+   "termination_conditions": [
+     "battery full (charge_level >= 0.999)",
+     "maximum step count reached"
+   ],
+   "evaluation_output": {
+     "format": "json",
+     "fields": ["avg_reward", "avg_steps", "episodes"]
+   }
+ }
openenv.yaml ADDED
@@ -0,0 +1,51 @@
+ id: EVChargeEnv-v0
+ name: EVChargeEnv
+ version: "0.1.0"
+ description: >
+   EVChargeEnv is a continuous-control electric vehicle charging environment
+   with dynamic pricing, fluctuating grid load, and multi-objective reward signals.
+   It is suitable for benchmarking agentic behavior and testing adaptation
+   to non-stationary conditions.
+
+ authors:
+   - name: Ozan Özayranci
+     github: "https://github.com/oozan"
+
+ license: mit
+
+ environment:
+   observation_space:
+     shape: [4]
+     type: box
+     description:
+       - charge_level (0–1)
+       - price (0–1)
+       - grid_load (0–1)
+       - time_step_norm (0–1)
+   action_space:
+     shape: [1]
+     type: box
+     description: continuous charge rate (0–1)
+   reward_components:
+     - progress_reward
+     - cost_penalty
+     - overload_penalty
+     - time_penalty
+   termination_conditions:
+     - charge >= 0.999
+     - max_steps reached
+
+ scenarios:
+   - easy
+   - medium
+   - hard
+
+ entry_point: env.ev_charge_env:EVChargeEnv
+
+ tags:
+   - energy
+   - control
+   - continuous
+   - stochastic
+   - reinforcement-learning
+   - openenv
requirements.txt ADDED
@@ -0,0 +1,2 @@
+ gymnasium
+ numpy
run_evaluation.py ADDED
@@ -0,0 +1,45 @@
+ import json
+ from env.ev_charge_env import EVChargeEnv
+ from agent.baseline_agent import BaselineAgent
+
+ def run_episode(env, agent):
+     obs, _ = env.reset()
+     total_reward = 0.0
+     steps = 0
+
+     while True:
+         action = agent.select_action(obs)
+         obs, reward, terminated, truncated, _ = env.step(action)
+         total_reward += reward
+         steps += 1
+         if terminated or truncated or steps >= 200:  # 200 is a safety cap; the env truncates at max_steps (48 by default)
+             break
+
+     return total_reward, steps
+
+ def main():
+     env = EVChargeEnv()
+     agent = BaselineAgent()
+
+     rewards = []
+     steps_list = []
+
+     for _ in range(5):
+         total_reward, steps = run_episode(env, agent)
+         rewards.append(total_reward)
+         steps_list.append(steps)
+
+     output = {
+         "avg_reward": sum(rewards) / len(rewards),
+         "avg_steps": sum(steps_list) / len(steps_list),
+         "episodes": len(rewards)
+     }
+
+     print(json.dumps(output))
+
+     # Save JSON for reproducibility
+     with open("sample_output.json", "w") as f:
+         json.dump(output, f, indent=4)
+
+ if __name__ == "__main__":
+     main()
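`run_evaluation.py` reports only averages over five unseeded episodes, so a single run says little about spread. The sketch below shows one way the aggregation step could be factored into a helper that also reports a standard deviation. The function name `summarize` and the extra `reward_std` field are assumptions for illustration; neither is part of the commit or the manifest's output format.

```python
import json
import statistics


def summarize(rewards, steps_list):
    """Build the three fields emitted by run_evaluation.py, plus a hypothetical spread field."""
    if not rewards or len(rewards) != len(steps_list):
        raise ValueError("rewards and steps_list must be non-empty and equal in length")
    return {
        "avg_reward": sum(rewards) / len(rewards),
        "avg_steps": sum(steps_list) / len(steps_list),
        "episodes": len(rewards),
        # Assumption: 'reward_std' is an illustrative extra, not a manifest field.
        "reward_std": statistics.pstdev(rewards),
    }


# Toy inputs chosen for easy arithmetic; these are not real evaluation results.
summary = summarize([-10.0, -8.0], [20, 22])
print(json.dumps(summary))
```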
run_price_aware_evaluation.py ADDED
@@ -0,0 +1,48 @@
+ import json
+ from env.ev_charge_env import EVChargeEnv
+ from agent.price_aware_agent import PriceAwareAgent
+
+
+ def run_episode(env, agent, seed=None):
+     obs, _ = env.reset(seed=seed)
+     total_reward = 0.0
+     steps = 0
+
+     while True:
+         action = agent.select_action(obs)
+         obs, reward, terminated, truncated, _ = env.step(action)
+         total_reward += reward
+         steps += 1
+         if terminated or truncated:
+             break
+
+     return total_reward, steps
+
+
+ def main():
+     # You can change scenario to "easy" / "medium" / "hard"
+     env = EVChargeEnv(scenario="medium")
+     agent = PriceAwareAgent()
+
+     rewards = []
+     steps_list = []
+
+     num_episodes = 10
+     for i in range(num_episodes):
+         total_reward, steps = run_episode(env, agent, seed=i)
+         rewards.append(total_reward)
+         steps_list.append(steps)
+
+     output = {
+         "agent_type": "price_aware",
+         "scenario": "medium",
+         "avg_reward": sum(rewards) / len(rewards),
+         "avg_steps": sum(steps_list) / len(steps_list),
+         "episodes": num_episodes,
+     }
+
+     print(json.dumps(output, indent=2))
+
+
+ if __name__ == "__main__":
+     main()
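The README contrasts greedy and price-aware behavior, and the trade-off can be explored without installing Gymnasium. The sketch below is a dependency-free approximation of one `medium` episode. It copies the update rules and reward coefficients from `EVChargeEnv.step()` but draws noise from Python's `random.Random`, so its numbers will not match the real environment. The names `simulate`, `greedy`, `price_aware` and `clip01` are assumptions for illustration.

```python
import random


def clip01(x):
    return min(1.0, max(0.0, x))


def greedy(charge, price, load):
    return 1.0


def price_aware(charge, price, load):
    # Same default thresholds as PriceAwareAgent.
    if charge >= 0.98:
        return 0.0
    if load >= 0.8:
        return 0.1
    if price <= 0.4:
        return 0.9
    if price >= 0.7:
        return 0.2
    return 0.5


def simulate(policy, seed, max_steps=48, base_price=0.30, base_load=0.50,
             load_threshold=0.85, scale=0.08):
    """Approximate one 'medium' episode; returns (total_reward, steps)."""
    rng = random.Random(seed)
    charge = rng.uniform(0.1, 0.4)
    price = clip01(base_price + rng.gauss(0, 0.05))
    load = clip01(base_load + rng.gauss(0, 0.05))
    total, steps = 0.0, 0
    while steps < max_steps and charge < 0.999:
        a = clip01(policy(charge, price, load))
        charge = clip01(charge + a * scale)
        # Price and load revert toward their base values with Gaussian noise.
        price = clip01(0.7 * price + 0.3 * base_price + rng.gauss(0, 0.05))
        load = clip01(0.6 * load + 0.4 * base_load + rng.gauss(0, 0.07))
        overload = max(0.0, load + a * 0.2 - load_threshold)
        total += a * scale * 5.0 - price * a * 4.0 - overload * 6.0 - 0.01
        steps += 1
    return total, steps


results = {
    name: [simulate(policy, seed) for seed in range(5)]
    for name, policy in (("greedy", greedy), ("price_aware", price_aware))
}
for name, runs in results.items():
    avg_reward = sum(r for r, _ in runs) / len(runs)
    avg_steps = sum(s for _, s in runs) / len(runs)
    print(f"{name}: avg_reward={avg_reward:.3f} avg_steps={avg_steps:.1f}")
```

Because both policies see the same noise sequence for a given seed, the greedy policy never needs more steps than the price-aware one. The price-aware policy may also stall: it stops charging at 0.98, below the 0.999 termination threshold, and then runs until truncation.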
sample_output.json ADDED
@@ -0,0 +1,5 @@
+ {
+     "avg_reward": -9.145111057869848,
+     "avg_steps": 20.2,
+     "episodes": 5
+ }
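The manifest declares the evaluation output fields, so an output file can be checked against that list mechanically. The sketch below does this for `sample_output.json`, with the field list and file contents copied inline to keep it self-contained. The helper name `compare_fields` is an assumption for illustration, and the second payload uses placeholder values rather than real results.

```python
import json

# Field list copied from evchargeenv_manifest.json ("evaluation_output" -> "fields").
MANIFEST_FIELDS = ["avg_reward", "avg_steps", "episodes"]

# Contents of sample_output.json from this commit.
SAMPLE_OUTPUT = '{"avg_reward": -9.145111057869848, "avg_steps": 20.2, "episodes": 5}'


def compare_fields(raw_json, expected_fields):
    """Return (missing, extra) field names relative to the expected list."""
    data = json.loads(raw_json)
    missing = [f for f in expected_fields if f not in data]
    extra = [k for k in data if k not in expected_fields]
    return missing, extra


missing, extra = compare_fields(SAMPLE_OUTPUT, MANIFEST_FIELDS)
print("missing:", missing, "extra:", extra)

# Shape of the output printed by run_price_aware_evaluation.py
# (numeric values are placeholders, not real results).
price_aware_like = json.dumps({
    "agent_type": "price_aware",
    "scenario": "medium",
    "avg_reward": 0.0,
    "avg_steps": 0.0,
    "episodes": 10,
})
_, extra_pa = compare_fields(price_aware_like, MANIFEST_FIELDS)
print("extra fields in price-aware output:", extra_pa)
```

The price-aware script emits two fields (`agent_type`, `scenario`) that the manifest does not list, so a strict consumer of the manifest would need to tolerate or declare them.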