---
license: mit
tags:
- reinforcement-learning
- multi-agent
- time-series
- diffusion-model
- energy-management
- smart-grid
---

# SolarSys: Scalable Hierarchical Coordination for Distributed Solar Energy

The SolarSys paper is the primary source for all claims below.

SolarSys is a novel **Hierarchical Multi-Agent Reinforcement Learning (HRL)** system that manages energy storage and peer-to-peer (P2P) trading across large communities of solar-equipped residences. This repository contains the full source code for SolarSys, including the trained policies, the custom Gym environment, and the hierarchical diffusion model used for data augmentation.

---

## Key Features and Performance

SolarSys addresses the scalability limitations of traditional Multi-Agent RL (MARL) methods, such as MAPPO and MADDPG, in large Virtual Power Plants (VPPs).

| Metric | SolarSys Performance (1000 Agents) | Key Mechanism |
| :--- | :--- | :--- |
| **Grid Import Reduction** | $27.48 \pm 0.42\%$ | Two-tier control scheme |
| **Daytime Solar Utilization** | $82.76 \pm 5.11\%$ | Intra-cluster MAPPO optimization |
| **Fairness (Jain's Index)** | 0.773 | Fairness term in reward function |
| **Scalability** | Stable convergence at 1000+ agents | Mean-Field Coordination at the Inter-Cluster layer |

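For readers unfamiliar with the fairness metric in the table, Jain's index has a standard closed form, $J(x) = (\sum_i x_i)^2 / (n \sum_i x_i^2)$, which ranges from $1/n$ (one participant captures everything) to 1 (perfectly equal shares). The sketch below is illustrative only: the function name `jains_index` and the choice of a per-household benefit as input are assumptions, since this README does not state which quantity the index is computed over.

```python
from typing import Sequence


def jains_index(allocations: Sequence[float]) -> float:
    """Jain's fairness index for non-negative allocations.

    `allocations` is a hypothetical per-household quantity (for example,
    energy benefit in kWh); the README does not specify the exact input.
    """
    n = len(allocations)
    if n == 0:
        raise ValueError("allocations must not be empty")
    total = sum(allocations)
    sum_of_squares = sum(x * x for x in allocations)
    if sum_of_squares == 0:
        # Everyone receives zero: treat the outcome as perfectly equal.
        return 1.0
    return (total * total) / (n * sum_of_squares)


# Equal shares give 1.0; a single winner among four gives 1/4.
equal_share = jains_index([1.0, 1.0, 1.0, 1.0])
single_winner = jains_index([1.0, 0.0, 0.0, 0.0])
```
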
---

## System Architecture

The core of SolarSys is a two-level decision hierarchy:

1. **Low-Level (Intra-Cluster):** Individual households use a **MAPPO** agent to make instantaneous decisions (charge, discharge, local P2P trade, grid trade) based on local meter readings and price signals.
2. **High-Level (Inter-Cluster):** Cluster Managers use a **Mean-Field** policy to coordinate bulk energy transfers between clusters, ensuring the overall system remains balanced against grid constraints.

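The general idea behind mean-field coordination is that each cluster manager reacts to an average over the other clusters instead of tracking every one individually, so its observation size does not grow with the number of clusters. The sketch below shows only that aggregation step. It is not the SolarSys coordinator: the function name `mean_field_observations` and the use of a net energy position (surplus positive, deficit negative) as the cluster state are assumptions made for illustration.

```python
from typing import List, Tuple


def mean_field_observations(net_energy_kwh: List[float]) -> List[Tuple[float, float]]:
    """Pair each cluster's own net energy with the mean of all other clusters.

    `net_energy_kwh` is a hypothetical per-cluster state: positive values
    are surplus, negative values are deficit.
    """
    n = len(net_energy_kwh)
    if n < 2:
        raise ValueError("need at least two clusters to form a mean field")
    total = sum(net_energy_kwh)
    # Subtracting a cluster's own value from the total gives the sum over
    # the others in O(1), so the whole pass is O(n) rather than O(n^2).
    return [(own, (total - own) / (n - 1)) for own in net_energy_kwh]


# Three clusters: each manager sees a 2-value observation, and would
# still see 2 values with 100 clusters.
observations = mean_field_observations([3.0, -1.0, 1.0])
```
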
---

## Data Generation Framework

To enable large-scale simulation with realistic temporal dynamics, SolarSys includes a **Hierarchical Diffusion Model** for generating synthetic, long-duration energy profiles that maintain both long-term (seasonal/monthly) and short-term (daily/hourly) characteristics.

* **Model:** Hierarchical Diffusion U-Net
* **Input:** Household ID and Day-of-Year conditioning
* **Output:** High-resolution time series for Grid Usage and Solar Generation (kWh).

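One common way to feed a day-of-year value to a model is a cyclical sine/cosine encoding, so that late December and early January end up close together. The sketch below illustrates that idea for the two conditioning inputs listed above. How SolarSys actually encodes them is not described in this README, so `encode_day_of_year`, `conditioning_vector`, and the 365-day period are all assumptions.

```python
import math
from typing import Tuple

DAYS_PER_YEAR = 365  # assumption: leap days are ignored for the period


def encode_day_of_year(day_of_year: int) -> Tuple[float, float]:
    """Map a 1-based day of year to a point on the unit circle."""
    if not 1 <= day_of_year <= 366:
        raise ValueError("day_of_year must be in 1..366")
    angle = 2.0 * math.pi * (day_of_year - 1) / DAYS_PER_YEAR
    return math.sin(angle), math.cos(angle)


def conditioning_vector(household_id: int, day_of_year: int) -> Tuple[int, float, float]:
    """Bundle a household index with the cyclical day features.

    In a real model the household index would typically select a learned
    embedding; here it is passed through unchanged.
    """
    sin_day, cos_day = encode_day_of_year(day_of_year)
    return household_id, sin_day, cos_day


# Hypothetical household 7 on 1 January.
example = conditioning_vector(household_id=7, day_of_year=1)
```
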
---

## Repository Structure

The project is organized into core modules and data folders.

```tree
SolarSys/
├── data/
│   ├── per_house/                    # Raw CSVs for diffusion model training
│   ├── training/                     # Cleaned RL training datasets
│   └── testing/                      # Cleaned RL evaluation datasets
├── models/
│   ├── diffusion_models/             # Trained Hierarchical Diffusion Model checkpoints
│   ├── mappo_models/                 # Trained MAPPO baselines and low-level agents
│   └── inter_agent_models/           # Trained MeanField high-level coordinator
├── Environment/
│   ├── __init__.py
│   └── solar_sys_environment.py      # Custom Gym environment for flat RL
├── cluster/
│   ├── __init__.py
│   └── inter_cluster_coordinator.py  # Logic for high-level trade matching
└── trainers/
    ├── __init__.py
    ├── hierarchical_train.py         # Main SolarSys HRL training script
    └── evaluation_scripts/           # Scripts for baselines (PG, MADDPG, MAPPO, MFAC)