	---
	license: mit
	tags:
	- reinforcement-learning
	- stable-baselines3
	- sb3-contrib
	- gymnasium
	- multi-agent
	- openenv
	library_name: stable-baselines3
	---

	# SpindleFlow RL — Delegation Policy

	A recurrent (LSTM) PPO agent, trained with RecurrentPPO from sb3-contrib on the SpindleFlow-v0 environment (OpenEnv).

	## Training summary
	| Metric | Value |
	|---|---|
	| Algorithm | RecurrentPPO (SB3 + sb3-contrib) |
	| Total timesteps | 30,000 |
	| Episodes completed | 13,526 |
	| First-5 mean reward | 1.2053 |
	| Last-5 mean reward | 2.2038 |
	| Improvement | +0.9984 |
	| Device | cuda |
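
The card does not say how the reward rows are computed. A minimal sketch, assuming "First-5" and "Last-5" are plain means over the first and last five episode returns and "Improvement" is their difference; `reward_improvement` and its `k` parameter are hypothetical names, not part of the training code:

```python
from statistics import mean


def reward_improvement(episode_rewards, k=5):
    """Return (first-k mean, last-k mean, difference) for a list of episode returns."""
    if len(episode_rewards) < k:
        raise ValueError(f"need at least {k} episodes, got {len(episode_rewards)}")
    first = mean(episode_rewards[:k])   # mean return over the first k episodes
    last = mean(episode_rewards[-k:])   # mean return over the last k episodes
    return first, last, last - first
```

Under that assumption, the reported improvement of +0.9984 differs in the last digit from subtracting the two rounded means (0.9985), which is what one would expect if the difference was taken before rounding.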

	![Reward Curve](reward_curve.png)

	## Load
	```python
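	from huggingface_hub import hf_hub_download
	from sb3_contrib import RecurrentPPO

	# Download the checkpoint from the Hub (cached locally), then load the policy.
	model_path = hf_hub_download("garvitsachdeva/spindleflow-rl", "spindleflow_model.zip")
	model = RecurrentPPO.load(model_path)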
	```
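
Because the policy is recurrent, inference has to carry the LSTM state from one step to the next and tell the policy when a new episode starts. The sketch below shows that pattern for a single Gymnasium-style environment. `run_episode` and `max_steps` are hypothetical names, and how to construct a SpindleFlow-v0 environment is not covered by this card, so `env` is assumed to be created elsewhere.

```python
import numpy as np


def run_episode(model, env, max_steps=1000, deterministic=True):
    """Roll out one episode, threading the LSTM state through model.predict()."""
    obs, _info = env.reset()
    lstm_states = None                         # None: policy starts from a zero hidden state
    episode_start = np.ones((1,), dtype=bool)  # True only on the first step of the episode
    total_reward = 0.0
    for _ in range(max_steps):
        action, lstm_states = model.predict(
            obs,
            state=lstm_states,
            episode_start=episode_start,
            deterministic=deterministic,
        )
        obs, reward, terminated, truncated, _info = env.step(action)
        total_reward += float(reward)
        episode_start = np.zeros((1,), dtype=bool)  # later steps continue the same episode
        if terminated or truncated:
            break
    return total_reward
```

With the model loaded as above and an environment in hand, `run_episode(model, env)` returns the undiscounted return of one episode. Passing `state=None` on every step would reset the LSTM memory each time and likely degrade the policy.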