---
license: mit
tags:
- reinforcement-learning
- stable-baselines3
- sb3-contrib
- gymnasium
- multi-agent
- openenv
library_name: stable-baselines3
---

# SpindleFlow RL — Delegation Policy

LSTM PPO agent trained on SpindleFlow-v0 (OpenEnv).

## Training summary

| Metric | Value |
|---|---|
| Algorithm | RecurrentPPO (SB3 + sb3-contrib) |
| Total timesteps | 30,000 |
| Episodes completed | 13,526 |
| First-5 mean reward | 1.2053 |
| Last-5 mean reward | 2.2038 |
| Improvement | +0.9984 |
| Device | cuda |

![Reward Curve](reward_curve.png)

## Load

```python
from huggingface_hub import hf_hub_download
from sb3_contrib import RecurrentPPO

# Download the checkpoint from the Hub, then load the policy from the local file.
model_path = hf_hub_download("garvitsachdeva/spindleflow-rl", "spindleflow_model.zip")
model = RecurrentPPO.load(model_path)
```
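
Because the policy is recurrent, inference has to carry the LSTM state from one step to the next and flag the first step of each episode. The sketch below shows one way to do that for a single, non-vectorized environment. `run_episode` and its `max_steps` cap are hypothetical names introduced here for illustration, not part of this repository. It assumes `model` exposes the `predict(obs, state=..., episode_start=..., deterministic=...)` signature of sb3-contrib's `RecurrentPPO` and that `env` follows the Gymnasium `reset()`/`step()` API.

```python
import numpy as np


def run_episode(model, env, deterministic=True, max_steps=1000):
    """Roll out one episode with a recurrent policy and return the total reward.

    Hypothetical helper: `model` is assumed to expose RecurrentPPO's predict()
    signature, and `env` is assumed to follow the Gymnasium reset()/step() API.
    """
    obs, _info = env.reset()
    lstm_states = None  # None lets predict() start from its initial LSTM state
    episode_start = np.ones((1,), dtype=bool)  # mark the first step of the episode
    total_reward = 0.0

    for _ in range(max_steps):
        # Feed the previous LSTM state back in so the policy keeps its memory.
        action, lstm_states = model.predict(
            obs,
            state=lstm_states,
            episode_start=episode_start,
            deterministic=deterministic,
        )
        obs, reward, terminated, truncated, _info = env.step(action)
        total_reward += float(reward)
        episode_start = np.zeros((1,), dtype=bool)  # later steps continue the episode
        if terminated or truncated:
            break

    return total_reward
```

With the model loaded as shown above, a call would look like `run_episode(model, env)`, where `env` is an instance of the SpindleFlow-v0 environment. How that environment is constructed is not covered in this card, so it is left to the caller.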