77ethers committed on
Commit
d446b52
·
verified ·
1 Parent(s): bafd9fc

Upload README.md with huggingface_hub

  1. README.md +197 -71
README.md CHANGED
@@ -14,81 +14,176 @@ tags:
14
 
15
  # GridOps — Community Microgrid Bridge Operator
16
 
17
- An OpenEnv reinforcement learning environment where an AI agent operates a **100-home community microgrid** in an Indian city during summer. The agent must balance solar generation, battery storage, diesel backup, and grid trade to keep the lights on while minimizing cost and emissions.
18
 
19
- ## Why This Matters
20
 
21
- Community microgrid operation is a real job in India under the RDSS (Revamped Distribution Sector Scheme). IEX prosumer bidding is live. The tension between local energy independence and national grid economics creates genuine multi-day planning challenges that simple heuristics cannot solve.
22
 
23
- ## Action Space (3 continuous dimensions)
24
 
25
- | Action | Range | Description |
26
  |--------|-------|-------------|
27
- | `battery_dispatch` | -1.0 to +1.0 | Charge (-100 kW) or discharge (+100 kW) the community battery |
28
- | `diesel_dispatch` | 0.0 to 1.0 | Diesel generator output (0-100 kW). Rs 100 startup cost. |
29
- | `demand_shedding` | 0.0 to 1.0 | Request residents reduce usage (0-20%). 50% rebounds next hour. |
30
-
31
- **The grid is NOT an action**: it automatically absorbs the residual (capped at ±200 kW). If demand exceeds all supply sources, that's a blackout.
32
-
33
- ## Observation Space
34
-
35
- | Field | Type | Description |
36
- |-------|------|-------------|
37
- | `hour` | float | Current hour in episode (0-72) |
38
- | `demand_kw` | float | Current aggregate demand (kW) |
39
- | `solar_kw` | float | Current solar generation (kW) |
40
- | `battery_soc` | float | Battery state-of-charge (0-1) |
41
- | `grid_price` | float | Current IEX price (Rs/kWh) |
42
- | `diesel_fuel_remaining` | float | Diesel fuel level (0-1) |
43
- | `diesel_is_on` | bool | Whether diesel was running last step |
44
- | `demand_forecast_4h` | list[4] | Noisy demand forecast (±15%) |
45
- | `solar_forecast_4h` | list[4] | Noisy solar forecast |
46
- | `price_forecast_4h` | list[4] | Noisy price forecast |
47
- | `cumulative_blackout_kwh` | float | Total unmet demand |
48
- | `cumulative_cost` | float | Net cost so far (Rs) |
49
- | `day_of_episode` | int | Current day (1-3) |
50
-
51
- ## Episode Structure
52
-
53
- - **Duration**: 3 days (72 hours, 1-hour steps)
54
- - **Why 3 days**: Day 1 = learn the cycle. Day 2 = heatwave hits. Day 3 = multi-day planning tested.
55
- - **Determinism**: Seeded RNG, so episodes are identical per seed.
56
-
57
- ## Tasks (Easy, Medium, Hard)
58
-
59
- | Task | Conditions | Challenge |
60
- |------|-----------|-----------|
61
- | **Task 1: Normal Summer** | Clear skies, ~100 kW avg demand, Rs 3-12 prices | Battery arbitrage + evening peak management |
62
- | **Task 2: Heatwave + Clouds** | Day 2-3 heatwave (+30% demand), intermittent clouds, price spikes to Rs 18 | Multi-day battery planning under uncertainty |
63
- | **Task 3: Extreme Crisis** | Full 3-day heatwave, -30% solar, +50% demand, Rs 8-20, limited diesel | Survival mode: all resources constrained |
64
 
65
  ## Grading (0.0 - 1.0)
66
 
67
  ```
68
- score = 0.50 × cost_efficiency + 0.25 × reliability + 0.25 × green_score
69
  ```
70
 
71
- - **Cost efficiency**: How much cheaper than a dumb "max grid import, no intelligence" baseline
72
- - **Reliability**: Fraction of demand met (blackouts penalized via Rs 150/kWh VoLL)
73
- - **Green score**: 1 - (diesel_used / total_demand)
74
 
75
  ## Baseline Scores
76
 
77
  | Strategy | Task 1 | Task 2 | Task 3 |
78
  |----------|--------|--------|--------|
79
- | **Oracle** | 0.81 | 0.82 | 0.77 |
80
- | Do-Nothing | 0.58 | 0.51 | 0.46 |
81
- | Always-Discharge | 0.58 | 0.51 | 0.47 |
82
- | Always-Diesel | 0.42 | 0.42 | 0.45 |
 
 
 
 
 
 
83
 
84
  ## Key Physics
85
 
86
- - **Battery**: 500 kWh, 100 kW max, 90% round-trip efficiency, **Rs 2.5/kWh degradation cost**
87
- - **Diesel**: 100 kW, Rs 25/kWh, **Rs 100 startup cost** (penalizes on-off cycling)
88
- - **Demand shedding**: Up to 20%, but **50% rebounds next hour** (no free lunch)
89
- - **VoLL**: Rs 150/kWh blackout penalty (smooth gradient, no hard gate)
 
 
 
 
90
 
91
- ## Setup
92
 
93
  ```bash
94
  # Install
@@ -97,13 +192,13 @@ pip install -e .
97
  # Run server
98
  uvicorn gridops.server.app:app --port 8000
99
 
100
- # Dashboard
101
  open http://localhost:8000/dashboard/
102
 
103
- # Oracle test
104
  python scripts/oracle_test.py
105
 
106
- # Inference
107
  export API_BASE_URL="https://router.huggingface.co/v1"
108
  export HF_TOKEN="your-token"
109
  export MODEL_NAME="meta-llama/Llama-3.3-70B-Instruct"
@@ -117,26 +212,57 @@ docker build -t gridops .
117
  docker run -p 8000:8000 gridops
118
  ```
119
 
120
  ## Project Structure
121
 
122
  ```
123
  gridops/
124
- ├── inference.py # LLM baseline (uses OpenAI client)
125
- ├── openenv.yaml # OpenEnv manifest
126
- ├── Dockerfile
127
- ├── server/app.py # Root entry point (OpenEnv validate)
128
  ├── gridops/
129
- │ ├── models.py # GridOpsAction, GridOpsObservation (Pydantic)
130
  │ ├── simulation/
131
- │ │ ├── physics.py # Energy balance, battery, VoLL, degradation
132
- │ │ └── scenarios.py # Demand/solar/price curve generators
133
  │ ├── tasks/
134
- │ │ ├── definitions.py # 3 task configs
135
- │ │ └── graders.py # 0-1 grading with cost/reliability/green
136
  │ └── server/
137
- │ ├── app.py # FastAPI + OpenEnv create_app
138
- │ ├── environment.py # OpenEnv Environment class
139
- │ └── static/index.html # Interactive dashboard
140
  └── scripts/
141
- └── oracle_test.py # Oracle validation + determinism check
142
  ```
 
14
 
15
  # GridOps — Community Microgrid Bridge Operator
16
 
17
+ > Keep the lights on for 100 homes. Don't go broke. Don't pollute.
18
 
19
+ An OpenEnv RL environment where an AI agent operates a **community microgrid in an Indian city during summer**. Every hour for 3 days, the agent decides how to use the battery, whether to run diesel, and whether to ask residents to cut usage — while the national grid automatically covers whatever is left over (up to its 200 kW limit).
20
 
21
+ **Live dashboard**: [77ethers-gridops.hf.space/dashboard/](https://77ethers-gridops.hf.space/dashboard/)
22
 
23
+ ---
24
+
25
+ ## Why This Environment Exists
26
+
27
+ Community microgrid operation is a **real job** in India under the [RDSS](https://rdss.gov.in/) (Revamped Distribution Sector Scheme). IEX prosumer bidding is live. Over 50 million Indian homes will have rooftop solar by 2030, and someone needs to manage the battery-grid-diesel tradeoff in real time.
28
+
29
+ This environment captures the core tension: **sell surplus solar during expensive evening peaks for profit — but if a heatwave continues tomorrow and your battery is empty, the neighborhood goes dark.**
30
+
31
+ Simple heuristics ("always discharge at peak", "always run diesel when low") score well below the rule-based oracle on every task (see Baseline Scores). The agent must learn multi-hour planning, price forecasting, and constraint management.
32
+
33
+ ---
34
+
35
+ ## The Problem at a Glance
36
 
37
+ **You have:**
38
+ - **Solar panels** — 250 kW peak, free, but only during daylight
39
+ - **Community battery** — 500 kWh storage, 100 kW max charge/discharge
40
+ - **Diesel generator** — 100 kW, but Rs 25/kWh + Rs 100 startup cost
41
+ - **National grid** — auto-imports/exports as slack (capped at 200 kW)
42
+
43
+ **You control (3 continuous actions):**
44
+
45
+ | Action | Range | What it does |
46
  |--------|-------|-------------|
47
+ | `battery_dispatch` | -1 to +1 | Charge (-100 kW) or discharge (+100 kW). Rs 2.5/kWh degradation. |
48
+ | `diesel_dispatch` | 0 to 1 | Diesel output (0-100 kW). Rs 25/kWh + Rs 100 startup if was off. |
49
+ | `demand_shedding` | 0 to 1 | Ask residents to cut 0-20% usage. **100% rebounds next hour.** Rs 40/kWh penalty. |
50
+
51
+ **You do NOT control the grid.** It automatically absorbs whatever energy gap remains after your decisions. If the gap exceeds 200 kW, that's a **blackout** (Rs 150/kWh penalty).
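A minimal sketch of how the three normalized actions could map to physical quantities, using only the ranges in the table above. The names `decode_action` and `PhysicalAction`, and the choice to clip out-of-range inputs, are assumptions for illustration and not the environment's actual code.

```python
from dataclasses import dataclass

BATTERY_MAX_KW = 100.0      # from the action table
DIESEL_MAX_KW = 100.0       # from the action table
MAX_SHED_FRACTION = 0.20    # shedding covers 0-20% of demand


@dataclass
class PhysicalAction:
    battery_kw: float       # negative = charging, positive = discharging
    diesel_kw: float        # generator output
    shed_fraction: float    # share of demand residents are asked to cut


def _clip(value: float, low: float, high: float) -> float:
    return max(low, min(high, value))


def decode_action(battery_dispatch: float,
                  diesel_dispatch: float,
                  demand_shedding: float) -> PhysicalAction:
    """Scale normalized actions to kW / fractions (clipping is an assumption)."""
    return PhysicalAction(
        battery_kw=_clip(battery_dispatch, -1.0, 1.0) * BATTERY_MAX_KW,
        diesel_kw=_clip(diesel_dispatch, 0.0, 1.0) * DIESEL_MAX_KW,
        shed_fraction=_clip(demand_shedding, 0.0, 1.0) * MAX_SHED_FRACTION,
    )


action = decode_action(-0.5, 1.5, 0.5)
print(action)
```

Here a `battery_dispatch` of -0.5 becomes a 50 kW charge, and the out-of-range diesel request is capped at the 100 kW maximum.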
52
+
53
+ ---
54
+
55
+ ## The Critical Bottleneck
56
+
57
+ At **8 PM every evening**, demand hits **250 kW** but the grid maxes out at **200 kW** and solar is zero.
58
+
59
+ The **50 kW gap** must come from your battery. If you discharged it for profit during the day, the neighborhood goes dark.
60
+
61
+ On a heatwave day (Task 2-3), demand spikes to **325-375 kW**. Now the gap is **125-175 kW** — you need battery + diesel + shedding just to survive. And in Task 3, the grid goes down entirely for 6 hours.
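The gap figures above follow from simple arithmetic on the 250 kW base peak and the 200 kW grid cap. The snippet below reproduces them; the function name `evening_gap_kw` is hypothetical, and the multipliers are the +30% and +50% demand uplifts quoted for the heatwave tasks.

```python
BASE_PEAK_KW = 250.0   # 8 PM demand on a normal day
GRID_CAP_KW = 200.0    # maximum grid import


def evening_gap_kw(demand_multiplier: float = 1.0,
                   grid_cap_kw: float = GRID_CAP_KW) -> float:
    """Power that must come from battery, diesel, or shedding at the evening peak."""
    return max(0.0, BASE_PEAK_KW * demand_multiplier - grid_cap_kw)


gaps = {
    "normal": evening_gap_kw(1.0),
    "heatwave (+30%)": evening_gap_kw(1.3),
    "crisis (+50%)": evening_gap_kw(1.5),
}
for name, gap in gaps.items():
    print(f"{name}: {gap:.0f} kW")
```

Since the battery and the diesel generator each top out at 100 kW, the 175 kW crisis gap cannot be closed by either one alone.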
62
+
63
+ ---
64
+
65
+ ## What the Agent Sees (Observation)
66
+
67
+ | Field | Description |
68
+ |-------|-------------|
69
+ | `hour` | Current hour in episode (0-72, starting 6 AM) |
70
+ | `demand_kw` | What the 100 homes need right now |
71
+ | `solar_kw` | Free solar power available (0 at night, up to 250 kW midday) |
72
+ | `battery_soc` | Battery charge level (0-1, i.e. 0-500 kWh) |
73
+ | `grid_price` | Current IEX electricity price (Rs 3-20/kWh) |
74
+ | `diesel_fuel_remaining` | Diesel tank level (0-1) |
75
+ | `diesel_is_on` | Was diesel running last step? (startup cost if turning on) |
76
+ | `demand_forecast_4h` | Noisy 4-hour demand forecast (±15%) |
77
+ | `solar_forecast_4h` | Noisy 4-hour solar forecast |
78
+ | `price_forecast_4h` | Noisy 4-hour price forecast |
79
+ | `cumulative_blackout_kwh` | Total blackout energy so far |
80
+ | `cumulative_cost` | Total money spent so far (Rs) |
81
+ | `flow_*` | Detailed energy flows (solar, grid import/export, battery in/out, diesel, demand) |
82
+
83
+ **Partial observability**: forecasts have ±15% Gaussian noise. The agent cannot perfectly predict heatwave intensity, cloud cover, or price spikes.
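To make the noise model concrete, here is a rough sketch of a seeded 4-hour forecast generator. It assumes the noise is multiplicative with a 15% standard deviation and that values are floored at zero; the README does not spell out either detail, and `noisy_forecast` is a hypothetical helper.

```python
import random


def noisy_forecast(true_values, rng, rel_sigma=0.15):
    """Perturb each true value with zero-mean Gaussian noise (assumed multiplicative)."""
    return [max(0.0, v * (1.0 + rng.gauss(0.0, rel_sigma))) for v in true_values]


true_demand_next_4h = [180.0, 220.0, 250.0, 230.0]  # illustrative values only

# A seeded generator gives the same forecast every run, matching the
# "identical episodes per seed" property.
forecast_a = noisy_forecast(true_demand_next_4h, random.Random(42))
forecast_b = noisy_forecast(true_demand_next_4h, random.Random(42))
print(forecast_a)
```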
84
+
85
+ ---
86
+
87
+ ## 3 Tasks (Each Tests a Different RL Capability)
88
+
89
+ ### Task 1: Normal Summer (Easy) — *Tests basic arbitrage*
90
+ - Clear skies, standard demand (~100 kW avg, 250 kW peak)
91
+ - Grid prices Rs 3-12 with clear cheap night / expensive evening pattern
92
+ - **What the agent must learn**: charge battery at night (cheap grid), discharge during evening peak (expensive grid), let solar cover midday
93
+
94
+ ### Task 2: Heatwave + Price Spike (Medium) — *Tests temporal planning*
95
+ - Day 2-3 heatwave (+30% demand), intermittent clouds
96
+ - **Rs 20 price spike** on Day 2 evening — visible in 4-hour forecast
97
+ - **What the agent must learn**: read the forecast, hold battery charge for the spike instead of greedily discharging early. A greedy policy discharges mid-afternoon; an RL agent that reads the forecast holds until 6 PM.
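The "hold for the spike" behavior can be illustrated with a toy rule that compares the current price with the 4-hour price forecast. This is only a sketch: `battery_action`, `spike_margin`, and `min_soc` are made-up names and thresholds, and a trained agent would not be limited to a rule like this.

```python
def battery_action(grid_price, price_forecast_4h, battery_soc,
                   spike_margin=1.25, min_soc=0.1):
    """Return a battery_dispatch value: 0.0 = hold, 1.0 = full discharge."""
    if battery_soc <= min_soc:
        return 0.0  # nothing useful left to discharge
    if max(price_forecast_4h) >= spike_margin * grid_price:
        return 0.0  # a clearly higher price is coming: hold the charge
    return 1.0      # prices are at or near their local peak: discharge


# Mid-afternoon with a Rs 20 spike visible in the forecast: hold.
hold = battery_action(12.0, [13.0, 16.0, 20.0, 18.0], battery_soc=0.8)
# At the spike itself, with prices falling afterwards: discharge.
sell = battery_action(20.0, [18.0, 15.0, 12.0, 10.0], battery_soc=0.8)
print(hold, sell)
```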
98
+
99
+ ### Task 3: Extreme Crisis + Grid Outage (Hard) — *Tests constraint management*
100
+ - Full 3-day heatwave, -30% solar from haze, +50% demand
101
+ - Limited diesel (33% tank = ~8 hours at full power)
102
+ - **6-hour grid outage** on Day 2 afternoon — grid cap drops to 0 kW
103
+ - **What the agent must learn**: aggressively pre-charge battery before the outage, ration diesel across the outage window, shed demand strategically to stretch resources. This is true microgrid islanding.
104
+
105
+ ---
106
 
107
  ## Grading (0.0 - 1.0)
108
 
109
  ```
110
+ score = 0.50 x cost_efficiency + 0.25 x reliability + 0.25 x green_score
111
  ```
112
 
113
+ | Component | Formula | What it rewards |
114
+ |-----------|---------|----------------|
115
+ | **Cost efficiency** (50%) | `1 - (agent_cost / baseline_cost)` | Spending less than a dumb "max grid import" baseline |
116
+ | **Reliability** (25%) | `(demand_met - blackout) / demand_met` | Keeping the lights on |
117
+ | **Green score** (25%) | `1 - (diesel_used / total_demand)` | Minimizing diesel emissions |
118
+
119
+ **Baseline**: "import max grid every hour, no battery/diesel/shedding" — physically possible, but expensive and suffers blackouts during peak hours and grid outages.
120
+
121
+ **VoLL (Value of Lost Load)**: Rs 150/kWh blackout penalty. This is a smooth gradient — no hard reliability cliff. The agent always gets signal for reducing blackouts incrementally.
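As an illustration, the weighted score can be computed directly from the three formulas in the table. The function below is a sketch, not the code in `graders.py`: the argument names are assumptions, and clamping each component to [0, 1] is an assumption made so the result stays in the documented 0.0 - 1.0 range.

```python
def clamp01(x: float) -> float:
    return max(0.0, min(1.0, x))


def grade(agent_cost, baseline_cost, demand_met, blackout_kwh,
          diesel_kwh, total_demand_kwh):
    """Combine cost efficiency, reliability, and green score (weights from the README)."""
    cost_efficiency = clamp01(1.0 - agent_cost / baseline_cost)
    reliability = clamp01((demand_met - blackout_kwh) / demand_met)
    green_score = clamp01(1.0 - diesel_kwh / total_demand_kwh)
    return 0.50 * cost_efficiency + 0.25 * reliability + 0.25 * green_score


# Illustrative numbers only, not measured results.
score = grade(agent_cost=60_000, baseline_cost=100_000,
              demand_met=7_200, blackout_kwh=72,
              diesel_kwh=360, total_demand_kwh=7_200)
print(round(score, 3))  # 0.685
```

With these made-up inputs the components are 0.40, 0.99, and 0.95, so cost efficiency dominates the result even though reliability is nearly perfect.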
122
+
123
+ ---
124
+
125
+ ## Why Heuristics Fail
126
+
127
+ | Strategy | Why it fails |
128
+ |----------|-------------|
129
+ | "Always discharge battery" | Empty by evening peak. 50 kW gap = blackout. Scores about the same as Do-Nothing. |
130
+ | "Always run diesel" | Rs 25/kWh vs Rs 5 grid at night. Hemorrhages money. Green score = 0. |
131
+ | "Shed demand whenever short" | Rs 40/kWh cost + 100% rebounds next hour. More expensive than diesel. |
132
+ | "Discharge when price > X" | Ignores battery state. Drains SOC before the real peak. |
133
+ | "Do nothing" | Grid alone can't cover evening peak. 3.6% blackout rate. |
134
+
135
+ The oracle (rule-based, time-of-day + price-aware) scores 0.70-0.81. The heuristics in the Baseline Scores table trail it by **0.20-0.39** depending on task and strategy, which indicates the environment has real optimization headroom.
136
+
137
+ ---
138
+
139
+ ## Anti-Gaming Design
140
+
141
+ The environment has 5 mechanisms that prevent reward hacking:
142
+
143
+ 1. **Shedding is expensive** — Rs 40/kWh + 100% rebound. Costlier than diesel. True emergency only.
144
+ 2. **Battery degradation** — Rs 2.5/kWh throughput. Prevents infinite cycling for tiny arbitrage.
145
+ 3. **Diesel startup cost** — Rs 100 per on-switch. Prevents on/off toggling.
146
+ 4. **VoLL is smooth** — Rs 150/kWh with no cliff. Agent can't exploit a binary gate.
147
+ 5. **Grid is capped** — 200 kW max (0 during outages). Can't just buy everything.
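The cost terms behind these mechanisms can be combined into a single per-step cost. The sketch below uses the rates quoted in this README with 1-hour steps (so kW and kWh are interchangeable); the function name, its arguments, and the assumption that exports are credited at the current grid price are illustrative only.

```python
DEGRADATION_RS_PER_KWH = 2.5
DIESEL_RS_PER_KWH = 25.0
DIESEL_STARTUP_RS = 100.0
SHEDDING_RS_PER_KWH = 40.0
VOLL_RS_PER_KWH = 150.0


def step_cost_rs(*, grid_price=0.0, grid_import_kwh=0.0, grid_export_kwh=0.0,
                 battery_throughput_kwh=0.0, diesel_kwh=0.0, diesel_was_on=False,
                 shed_kwh=0.0, blackout_kwh=0.0):
    """Net cost of one hour; exports are assumed to earn the current grid price."""
    cost = (grid_import_kwh - grid_export_kwh) * grid_price
    cost += DEGRADATION_RS_PER_KWH * battery_throughput_kwh
    cost += DIESEL_RS_PER_KWH * diesel_kwh
    if diesel_kwh > 0 and not diesel_was_on:
        cost += DIESEL_STARTUP_RS  # charged only when switching on
    cost += SHEDDING_RS_PER_KWH * shed_kwh
    cost += VOLL_RS_PER_KWH * blackout_kwh
    return cost


cold_start = step_cost_rs(diesel_kwh=50.0, diesel_was_on=False)
already_on = step_cost_rs(diesel_kwh=50.0, diesel_was_on=True)
shed_10 = step_cost_rs(shed_kwh=10.0)
print(cold_start, already_on, shed_10)
```

Comparing `cold_start` with `already_on` shows the Rs 100 toggling penalty, and `shed_10` shows that shedding 10 kWh costs more than generating it with an already-running diesel unit (Rs 400 versus Rs 250).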
148
+
149
+ ---
150
 
151
  ## Baseline Scores
152
 
153
  | Strategy | Task 1 | Task 2 | Task 3 |
154
  |----------|--------|--------|--------|
155
+ | **Oracle (rule-based)** | **0.79** | **0.81** | **0.70** |
156
+ | Do-Nothing (grid only) | 0.58 | 0.51 | 0.45 |
157
+ | Always-Discharge | 0.59 | 0.51 | 0.45 |
158
+ | Always-Diesel | 0.42 | 0.42 | 0.44 |
159
+
160
+ - **Deterministic**: identical scores across 3 runs (seeded RNG)
161
+ - **Oracle ceiling is < 1.0**: proves the environment has real physics constraints, not inflated scores
162
+ - **Clear separation**: oracle >> heuristics on every task
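The separation claim can be checked from the table itself. This short script computes, for each task, the margin between the oracle and the best-scoring heuristic, using only the numbers listed above.

```python
scores = {
    "Oracle (rule-based)": (0.79, 0.81, 0.70),
    "Do-Nothing (grid only)": (0.58, 0.51, 0.45),
    "Always-Discharge": (0.59, 0.51, 0.45),
    "Always-Diesel": (0.42, 0.42, 0.44),
}

oracle = scores["Oracle (rule-based)"]
heuristics = {k: v for k, v in scores.items() if k != "Oracle (rule-based)"}

# Margin over the strongest heuristic on each task (Task 1, Task 2, Task 3).
margins = []
for task in range(3):
    best_heuristic = max(v[task] for v in heuristics.values())
    margins.append(round(oracle[task] - best_heuristic, 2))

print(margins)  # [0.2, 0.3, 0.25]
```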
163
+
164
+ ---
165
 
166
  ## Key Physics
167
 
168
+ | Component | Spec | Cost |
169
+ |-----------|------|------|
170
+ | **Solar** | 250 kW peak, bell curve 6 AM - 6 PM | Free |
171
+ | **Battery** | 500 kWh, 100 kW max, 90% round-trip (√0.9 ≈ 94.9% each way) | Rs 2.5/kWh degradation |
172
+ | **Diesel** | 100 kW max | Rs 25/kWh + Rs 100 startup |
173
+ | **Grid** | 200 kW max import/export (slack variable) | Market price Rs 3-20/kWh |
174
+ | **Blackout** | Unmet demand when all sources exhausted | Rs 150/kWh VoLL penalty |
175
+ | **Shedding** | Up to 20% demand reduction | Rs 40/kWh + 100% rebound next hour |
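To show what "sqrt each way" means for the battery, the sketch below applies a one-way efficiency of √0.9 on both charge and discharge, so a full round trip returns 90% of the energy put in. The helper names and the exact place where losses are applied are assumptions; `physics.py` may model this differently.

```python
import math

CAPACITY_KWH = 500.0
ROUND_TRIP_EFF = 0.90
ONE_WAY_EFF = math.sqrt(ROUND_TRIP_EFF)  # about 0.9487


def charge(stored_kwh: float, energy_in_kwh: float) -> float:
    """Stored energy after pushing energy_in_kwh into the battery."""
    return min(CAPACITY_KWH, stored_kwh + energy_in_kwh * ONE_WAY_EFF)


def discharge(stored_kwh: float, energy_from_cells_kwh: float):
    """Return (remaining stored energy, energy delivered to the microgrid)."""
    drawn = min(stored_kwh, energy_from_cells_kwh)
    return stored_kwh - drawn, drawn * ONE_WAY_EFF


stored = charge(0.0, 100.0)                  # about 94.87 kWh ends up stored
remaining, delivered = discharge(stored, stored)
print(round(stored, 2), round(delivered, 2))
```

Putting 100 kWh in and drawing it all back out delivers about 90 kWh, which is why small price spreads are not worth cycling for once the Rs 2.5/kWh degradation cost is added.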
176
 
177
+ **Energy balance every step:**
178
+ ```
179
+ supply = solar + grid_import + battery_discharge + diesel
180
+ consume = effective_demand + grid_export + battery_charge
181
+ ```
182
+ Supply always equals consumption. Any unmet demand beyond grid cap = blackout.
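One way to read the balance is that the grid is solved for last, as the slack term. The sketch below resolves a single 1-hour step under that reading. It ignores battery efficiency and the shedding rebound for brevity, and it assumes surplus beyond the export cap is simply curtailed; `balance_step` is a hypothetical name.

```python
def balance_step(demand_kw, solar_kw, battery_kw, diesel_kw,
                 shed_fraction=0.0, grid_cap_kw=200.0):
    """Resolve one step; battery_kw > 0 discharges, battery_kw < 0 charges."""
    effective_demand = demand_kw * (1.0 - shed_fraction)
    battery_discharge = max(0.0, battery_kw)
    battery_charge = max(0.0, -battery_kw)

    # Positive residual = shortfall the grid must import; negative = surplus.
    residual = (effective_demand + battery_charge
                - solar_kw - battery_discharge - diesel_kw)

    return {
        "effective_demand": effective_demand,
        "grid_import": min(max(0.0, residual), grid_cap_kw),
        "grid_export": min(max(0.0, -residual), grid_cap_kw),
        "blackout": max(0.0, residual - grid_cap_kw),
        "curtailed": max(0.0, -residual - grid_cap_kw),  # assumption
        "battery_charge": battery_charge,
        "battery_discharge": battery_discharge,
    }


# 8 PM peak with an idle battery versus a 50 kW discharge.
idle = balance_step(demand_kw=250.0, solar_kw=0.0, battery_kw=0.0, diesel_kw=0.0)
helped = balance_step(demand_kw=250.0, solar_kw=0.0, battery_kw=50.0, diesel_kw=0.0)
print(idle["blackout"], helped["blackout"])
```

With the battery idle the grid imports its full 200 kW and 50 kW goes unserved; a 50 kW discharge removes the blackout.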
183
+
184
+ ---
185
+
186
+ ## Setup & Usage
187
 
188
  ```bash
189
  # Install
 
192
  # Run server
193
  uvicorn gridops.server.app:app --port 8000
194
 
195
+ # Interactive dashboard
196
  open http://localhost:8000/dashboard/
197
 
198
+ # Validate oracle + determinism
199
  python scripts/oracle_test.py
200
 
201
+ # Run LLM baseline
202
  export API_BASE_URL="https://router.huggingface.co/v1"
203
  export HF_TOKEN="your-token"
204
  export MODEL_NAME="meta-llama/Llama-3.3-70B-Instruct"
 
212
  docker run -p 8000:8000 gridops
213
  ```
214
 
215
+ ## OpenEnv Validation
216
+
217
+ ```bash
218
+ # Local structure check
219
+ openenv validate
220
+
221
+ # Runtime check (against live server)
222
+ openenv validate --url http://localhost:8000
223
+ ```
224
+
225
+ ---
226
+
227
  ## Project Structure
228
 
229
  ```
230
  gridops/
231
+ ├── inference.py # LLM baseline (API_BASE_URL, MODEL_NAME, HF_TOKEN)
232
+ ├── openenv.yaml # OpenEnv manifest
233
+ ├── Dockerfile # Docker deployment
234
+ ├── server/app.py # Root entry point (openenv validate)
235
  ├── gridops/
236
+ │ ├── models.py # GridOpsAction, GridOpsObservation (Pydantic)
237
  │ ├── simulation/
238
+ │ │ ├── physics.py # Energy balance, battery, VoLL, degradation, outages
239
+ │ │ └── scenarios.py # Demand/solar/price curve generators
240
  │ ├── tasks/
241
+ │ │ ├── definitions.py # 3 task configs (normal, heatwave, crisis+outage)
242
+ │ │ └── graders.py # 0-1 scoring: cost + reliability + green
243
  │ └── server/
244
+ │ ├── app.py # FastAPI + OpenEnv create_app
245
+ │ ├── environment.py # OpenEnv Environment class
246
+ │ └── static/index.html # Interactive dashboard with energy flows
247
  └── scripts/
248
+ └── oracle_test.py # Oracle + heuristic validation + determinism check
249
  ```
250
+
251
+ ---
252
+
253
+ ## API Endpoints
254
+
255
+ | Endpoint | Method | Description |
256
+ |----------|--------|-------------|
257
+ | `/health` | GET | Health check |
258
+ | `/schema` | GET | Action/observation/state JSON schemas |
259
+ | `/metadata` | GET | Environment name and description |
260
+ | `/reset` | POST | Reset environment (OpenEnv standard) |
261
+ | `/step` | POST | Execute action (OpenEnv standard) |
262
+ | `/state` | GET | Current state (OpenEnv standard) |
263
+ | `/ws` | WebSocket | Persistent session (OpenEnv standard) |
264
+ | `/api/reset` | POST | Stateful reset (dashboard) |
265
+ | `/api/step` | POST | Stateful step (dashboard) |
266
+ | `/api/state` | GET | Stateful state (dashboard) |
267
+ | `/tasks` | GET | List available tasks |
268
+ | `/dashboard/` | GET | Interactive web UI |
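A rough sketch of calling the OpenEnv-style endpoints from Python with only the standard library. The request bodies are assumptions: the README does not document the JSON shape, so the `{"action": {...}}` wrapper below is a guess that should be checked against `/schema` before use. The helper names are hypothetical.

```python
import json
from urllib import request

BASE_URL = "http://localhost:8000"  # local server started with uvicorn


def build_post(path: str, payload: dict) -> request.Request:
    """Build (but do not send) a JSON POST request."""
    return request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(path: str, payload: dict) -> dict:
    """Send the request and decode the JSON response (needs a running server)."""
    with request.urlopen(build_post(path, payload), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Assumed payload shape; confirm the real schema via GET /schema.
step_request = build_post("/step", {
    "action": {
        "battery_dispatch": 0.5,
        "diesel_dispatch": 0.0,
        "demand_shedding": 0.0,
    }
})
print(step_request.get_method(), step_request.full_url)
```

With the server running, `post_json("/reset", {})` followed by `post_json("/step", ...)` would drive one episode step, assuming the payload shape above matches the schema.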