---
license: mit
tags:
- reinforcement-learning
- stable-baselines3
- sb3-contrib
- gymnasium
- multi-agent
- openenv
library_name: stable-baselines3
---
# SpindleFlow RL — Delegation Policy
Recurrent (LSTM) PPO agent trained on the `SpindleFlow-v0` environment (OpenEnv).
## Training summary
| Metric | Value |
|---|---|
| Algorithm | RecurrentPPO (SB3 + sb3-contrib) |
| Total timesteps | 30,000 |
| Episodes completed | 13,526 |
| First-5 mean reward | 1.2053 |
| Last-5 mean reward | 2.2038 |
| Improvement | +0.9984 |
| Device | cuda |
![Reward Curve](reward_curve.png)
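The summary rows above can be reproduced from a list of per-episode rewards. The sketch below assumes that "First-5" and "Last-5" are plain means over the first and last five episodes and that "Improvement" is their difference; the card does not state this explicitly, and `reward_improvement` is a hypothetical helper, not part of the training code.

```python
from statistics import mean


def reward_improvement(episode_rewards, k=5):
    """Return (first-k mean, last-k mean, difference) for a reward history.

    Assumption: the table's metrics are simple means over k episodes.
    """
    if len(episode_rewards) < k:
        raise ValueError(f"need at least {k} episodes, got {len(episode_rewards)}")
    first = mean(episode_rewards[:k])   # mean reward over the first k episodes
    last = mean(episode_rewards[-k:])   # mean reward over the last k episodes
    return first, last, last - first
```

If the difference is taken before rounding, it can differ from the difference of the rounded means in the last digit, which would be consistent with the table (2.2038 − 1.2053 = 0.9985 versus the reported +0.9984).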
## Load
```python
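from huggingface_hub import hf_hub_download
from sb3_contrib import RecurrentPPO

# Download the checkpoint from the Hub (cached locally), then load the policy.
model_path = hf_hub_download("garvitsachdeva/spindleflow-rl", "spindleflow_model.zip")
model = RecurrentPPO.load(model_path)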
```
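Because the policy is recurrent, inference has to carry the LSTM state from one step to the next. The following is a minimal sketch, assuming a single Gymnasium-style environment; how `SpindleFlow-v0` is constructed is not covered in this card, so `env` is passed in by the caller. `run_episode` and `max_steps` are hypothetical names for illustration, while `state` and `episode_start` are the arguments of `RecurrentPPO.predict`.

```python
import numpy as np


def run_episode(model, env, deterministic=True, max_steps=1000):
    """Roll out one episode, threading the LSTM state through predict()."""
    obs, _info = env.reset()
    lstm_state = None  # None lets the policy start from its initial state
    episode_start = np.ones((1,), dtype=bool)  # flags the first step of the episode
    total_reward = 0.0
    for _ in range(max_steps):
        action, lstm_state = model.predict(
            obs,
            state=lstm_state,
            episode_start=episode_start,
            deterministic=deterministic,
        )
        obs, reward, terminated, truncated, _info = env.step(action)
        total_reward += float(reward)
        episode_start = np.zeros((1,), dtype=bool)  # later steps reuse the state
        if terminated or truncated:
            break
    return total_reward
```

Calling `run_episode(model, env)` with the loaded `model` returns the episode's total reward. Dropping the returned `lstm_state` between steps would make the policy act as if every step were the first of an episode.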