---
license: mit
tags:
- reinforcement-learning
- multi-agent
- time-series
- diffusion-model
- energy-management
- smart-grid
---
# SolarSys: Scalable Hierarchical Coordination for Distributed Solar Energy
## Abstract
Distributed solar energy resources (DSERs) in residential power systems are rapidly increasing due to declining photovoltaic module prices. This increase has led to the development of virtual power plants (VPPs) that coordinate local generation to support reliable and efficient grid operation. However, many existing VPPs rely on centralized control, where a central operator aggregates household-level information and determines trading outcomes, preventing individual users from independently and concurrently trading surplus energy.
As a result, researchers have explored decentralized coordination, where individual households trade surplus solar energy directly with others to improve utilization and fairness within residential networks. However, such approaches perform well only in small deployments and become inefficient at scale as the number of participating households increases.
To address these limitations, we design SolarSys, a hierarchical coordination system for peer-to-peer energy trading in distributed solar and storage communities. SolarSys integrates local power sensing from smart meters and inverters with edge computation and distributed coordination to manage energy usage across large residential networks. The system is organized hierarchically to support decision making at both the household and cluster levels. Each household node measures its generation, demand, and battery state, and makes local energy management decisions using a policy trained with Multi-Agent Proximal Policy Optimization to improve local solar usage and maintain cluster-level energy balance.
Across clusters, SolarSys leverages a mean-field multi-agent reinforcement learning approach that updates high-level decisions using aggregated cluster conditions. This design allows clusters to adjust their decisions so the overall system remains consistent with practical grid constraints. In addition, we deploy SolarSys on a Raspberry Pi and find that the learned policies can run efficiently on low-power edge hardware.
We evaluate SolarSys using real smart meter data from seven residential communities. We find that SolarSys reduces the energy drawn from the grid by 27.48 ± 0.42%, increases daytime solar utilization to 82.76 ± 5.11%, and improves fairness to 0.773 (Jain's index). The results show that SolarSys enables efficient, fair, and scalable peer-to-peer energy trading in large-scale virtual power plant deployments.
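For readers unfamiliar with the fairness metric, Jain's index for n non-negative allocations x_i is (Σ x_i)² / (n · Σ x_i²), which ranges from 1/n (one participant receives everything) to 1 (all participants receive the same amount). The sketch below computes it in plain Python. The function name, the all-zero convention, and the sample values are illustrative assumptions; the quantity SolarSys feeds into the index is not specified here.

```python
def jains_index(allocations):
    """Jain's fairness index: (sum x)^2 / (n * sum x^2)."""
    values = [float(x) for x in allocations]
    if not values:
        raise ValueError("allocations must be non-empty")
    sum_of_squares = sum(x * x for x in values)
    if sum_of_squares == 0.0:
        # Assumed convention: an all-zero allocation counts as perfectly equal.
        return 1.0
    total = sum(values)
    return total * total / (len(values) * sum_of_squares)


# Hypothetical per-household benefits, not values from the evaluation.
equal_split = jains_index([2.0, 2.0, 2.0, 2.0])
single_winner = jains_index([5.0, 0.0, 0.0, 0.0])
```

With these inputs, `equal_split` is at the upper bound of 1 and `single_winner` is at the lower bound of 1/4.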
---
## System Architecture
The core of SolarSys is a two-level decision hierarchy:
1. **Low-Level (Intra-Cluster):** Individual households use a **MAPPO** agent to make instantaneous decisions (charge, discharge, local P2P trade, grid trade) based on local meter readings and price signals.
2. **High-Level (Inter-Cluster):** Cluster Managers use a **Mean-Field** policy to coordinate bulk energy transfers between clusters, ensuring the overall system remains balanced against grid constraints.
![SolarSys Hierarchical Architecture](assets/solarSys.png)
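The aggregation step between the two levels can be sketched as follows. This is a minimal illustration of the mean-field idea (each cluster sees its own aggregate plus the average of the others), not the code in `cluster.py` or `meanfield.py`. The `Household` class, the function names, and the choice of net energy as the aggregated quantity are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Household:
    """Hypothetical household record for one time step."""
    generation_kwh: float
    demand_kwh: float

    @property
    def net_kwh(self):
        # Positive means surplus, negative means deficit.
        return self.generation_kwh - self.demand_kwh


def cluster_net(households):
    """Aggregate household surpluses and deficits into one cluster value."""
    return sum(h.net_kwh for h in households)


def mean_field_observations(cluster_nets):
    """Pair each cluster's own net energy with the mean net energy of all
    other clusters, so the observation size does not grow with the number
    of clusters."""
    n = len(cluster_nets)
    if n < 2:
        raise ValueError("need at least two clusters")
    total = sum(cluster_nets)
    return [(own, (total - own) / (n - 1)) for own in cluster_nets]


clusters = [
    [Household(3.0, 1.0), Household(1.0, 2.0)],  # net +1.0
    [Household(0.0, 2.0)],                       # net -2.0
    [Household(5.0, 1.0)],                       # net +4.0
]
nets = [cluster_net(c) for c in clusters]
observations = mean_field_observations(nets)
```

Each entry of `observations` is an `(own_net, mean_of_others)` pair that a high-level policy could take as input. The point of the design is that a cluster manager reasons about one summary value rather than every other cluster individually.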
---
## Data Generation Framework
To enable large-scale simulation with realistic temporal dynamics, SolarSys includes a **Hierarchical Diffusion Model** for generating synthetic, long-duration energy profiles that maintain both long-term (seasonal/monthly) and short-term (daily/hourly) characteristics.
* **Model:** Hierarchical Diffusion U-Net
* **Input:** Location-based housing datasets (e.g., the Ausgrid dataset or a New York dataset)
* **Output:** High-resolution time series for Grid Usage and Solar Generation (kWh).
![Hierarchical Diffusion Framework](assets/hier_diffusion.png)
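To make the two time scales concrete, the sketch below splits an hourly series into a coarse level (daily totals) and a fine level (the within-day shape, expressed as fractions of the daily total), then recombines them. It only illustrates what "long-term" and "short-term" structure mean for a kWh profile. It is not the diffusion model, and the function names and the flat-shape fallback for zero-energy days are assumptions made for the example.

```python
def split_scales(hourly_kwh, hours_per_day=24):
    """Split a series into daily totals (coarse) and intra-day shapes (fine)."""
    if len(hourly_kwh) % hours_per_day != 0:
        raise ValueError("series length must be a whole number of days")
    daily_totals, daily_shapes = [], []
    for start in range(0, len(hourly_kwh), hours_per_day):
        day = hourly_kwh[start:start + hours_per_day]
        total = sum(day)
        daily_totals.append(total)
        if total > 0:
            daily_shapes.append([value / total for value in day])
        else:
            # Assumed fallback: a day with no energy gets a flat shape.
            daily_shapes.append([1.0 / hours_per_day] * hours_per_day)
    return daily_totals, daily_shapes


def merge_scales(daily_totals, daily_shapes):
    """Recombine the coarse and fine levels into one hourly series."""
    return [
        total * fraction
        for total, shape in zip(daily_totals, daily_shapes)
        for fraction in shape
    ]


# Toy example with 4 "hours" per day to keep the numbers readable.
series = [1.0, 1.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0]
totals, shapes = split_scales(series, hours_per_day=4)
rebuilt = merge_scales(totals, shapes)
```

Here `totals` carries the day-to-day trend while `shapes` carries the daily pattern, and `rebuilt` matches the original series.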
To ensure all necessary libraries are available, install dependencies using the provided `requirements.txt` file:
```bash
pip install -r requirements.txt
```
---
## Repository Structure
The project is organized into core modules and data folders.
```tree
SolarSys_Hugging_face/
├── assets/
├── requirements.txt  # Project Dependencies
├── Dataset/  # Energy Data (Raw, Processed, and Synthetic)
│   ├── Data/
│   │   ├── Ausgrid_processed_data.zip  # Processed data (Australia)
│   │   ├── Ausgrid_raw_data/
│   │   └── Pecan_street_processed_data.zip  # Processed data (US)
│   └── Generated_Data.zip  # Example synthetic data (2000 agents)
├── Data_generation_tool_kit/  # Hierarchical Diffusion Model Code
│   ├── Hier_diffusion_energy/
│   │   ├── hierarchial_diffusion_model.py  # H-Diffusion U-Net Architecture
│   │   └── global_scaler.gz  # Global normalization scaler; delete it and rerun if it causes problems
│   ├── dataloader.py
│   ├── train.py  # Diffusion model training script
│   └── generate.py  # Script to generate long-term sequences
├── Models/  # Trained RL policies for trial use; retrain your own models for best results
│   ├── 100agents_10size/
│   │   └── models/
│   │       ├── inter_ep10000.pth  # High-Level (Inter-Cluster) policy
│   │       └── low_clusterX_ep10000.pth  # Low-Level (Intra-Cluster) policies
│   └── (Additional scaling folders)  # e.g., 5agents_5size/, 1000agents_20size/, etc.
├── SolarSys/  # SolarSys Implementation (MAPPO + MeanField)
│   ├── Environment/
│   │   ├── cluster_env_wrapper.py  # Vectorized environment wrapper
│   │   └── solar_sys_environment.py  # Core Gym environment definition
│   ├── mappo/  # MAPPO policy definition (Intra-Cluster)
│   │   └── trainer/
│   │       └── mappo.py  # MAPPO algorithm
│   ├── meanfield/  # MeanField policy definition (Inter-Cluster)
│   │   └── trainer/
│   │       └── meanfield.py  # MeanField algorithm core
│   ├── cluster.py  # Inter-Cluster Coordinator logic
│   ├── training_freezing.py  # Main SolarSys HRL training script
│   └── cluster_evaluation.py  # Evaluation script for SolarSys
└── Other_algorithms/  # Baselines
    ├── HC_MAPPO/  # Hierarchical Cluster MAPPO
    │   ├── Environment/
    │   ├── HC_MAPPO_train.py  # Training script for HC-MAPPO
    │   └── HC_MAPPO_evaluation.py  # Evaluation script for HC-MAPPO
    └── Flat_System/  # Flat Baselines
        ├── maddpg/  # Multi-Agent DDPG
        │   ├── maddpg_train.py
        │   └── maddpg_evaluation.py
        ├── mappo/  # Flat MAPPO
        │   ├── mappo_train.py
        │   └── mappo_evaluation.py
        ├── meanfield/  # Flat MeanField Actor-Critic (MFAC)
        │   ├── meanfield_train.py
        │   └── meanfield_evaluation.py
        ├── PG/  # Policy Gradient (PG)
        │   ├── pg_train.py
        │   └── pg_evaluation.py
        └── solar_sys_environment.py  # Copy of the flat environment for baselines