safe-autonomous-systems/ma-ppo-RBC2D-medium-v0

	---
	library_name: stable-baselines3
	tags:
	- reinforcement-learning
	- stable-baselines3
	- deep-reinforcement-learning
	- fluidgym
	- active-flow-control
	- fluid-dynamics
	- simulation
	- RBC2D-medium-v0
	model-index:
	- name: PPO-RBC2D-medium-v0
	  results:
	  - task:
	      type: reinforcement-learning
	      name: reinforcement-learning
	    dataset:
	      name: FluidGym-RBC2D-medium-v0
	      type: fluidgym
	    metrics:
	    - type: mean_reward
	      value: 0.06
	      name: mean_reward


	---

	# PPO on RBC2D-medium-v0 (FluidGym)

	This repository is part of the FluidGym benchmark results. It contains PPO agents trained with Stable Baselines3 on the RBC2D-medium-v0 environment, one per seed.

	## Evaluation Results

	### Global Performance (Aggregated across 5 seeds)
	Mean Reward: 0.06 ± 0.15

	### Per-Seed Statistics
	| Run | Mean Reward | Std Dev |
	| --- | --- | --- |
	| Seed 0 | -0.06 | 1.56 |
	| Seed 1 | 0.27 | 1.35 |
	| Seed 2 | 0.21 | 0.93 |
	| Seed 3 | -0.12 | 1.43 |
	| Seed 4 | 0.00 | 1.38 |
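
As a sanity check, the aggregate figure can be reproduced from the per-seed means in the table. The sketch below assumes that the reported ± is the standard deviation of the five per-seed means. That is an inference from the numbers, not something this card states.

```python
from statistics import mean, pstdev, stdev

# Per-seed mean rewards copied from the table above (seeds 0 to 4).
per_seed_mean = [-0.06, 0.27, 0.21, -0.12, 0.00]

overall_mean = mean(per_seed_mean)
spread_population = pstdev(per_seed_mean)  # divides by n
spread_sample = stdev(per_seed_mean)       # divides by n - 1

print(f"mean of seed means: {overall_mean:.2f}")       # 0.06
print(f"population std dev: {spread_population:.2f}")  # 0.15
print(f"sample std dev:     {spread_sample:.2f}")      # 0.17
```

The population form rounds to the reported 0.15, while the sample form gives 0.17. In both cases the spread across seeds is larger than the aggregate mean of 0.06.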

	## About FluidGym
	FluidGym is a benchmark for reinforcement learning in active flow control.

	## Usage
	Each seed is stored in its own subdirectory. To load a model, for example the checkpoint in subdirectory `0`:
	```python
	from stable_baselines3 import PPO

	# Load the latest checkpoint from the "0" subdirectory.
	model = PPO.load("0/ckpt_latest.zip")
	```

	Important: The models were trained with `fluidgym==0.0.2`. To use them with
	newer versions of FluidGym, wrap the environment in a `FlattenObservation`
	wrapper as shown below:
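	```python
	import fluidgym
	from fluidgym.wrappers import FlattenObservation
	from stable_baselines3 import PPO

	# Create the environment and flatten its observations so they match
	# the format the agents were trained on.
	env = fluidgym.make("RBC2D-medium-v0")
	env = FlattenObservation(env)
	model = PPO.load("path_to_model/ckpt_latest.zip")

	# Reset with a fixed seed for reproducibility.
	obs, info = env.reset(seed=42)

	# Query the policy for a deterministic action and advance one step.
	action, _ = model.predict(obs, deterministic=True)
	obs, reward, terminated, truncated, info = env.step(action)
	```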
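
The snippet above advances the environment by a single step. To roll out a full episode, a loop along the following lines could be used. This is only a sketch: `checkpoint_path` and `run_episode` are hypothetical helpers, not part of FluidGym or Stable Baselines3. It assumes that every seed directory holds a `ckpt_latest.zip` like directory `0`, that the environment follows the `reset`/`step` signature shown above with plain boolean `terminated` and `truncated` flags, and that the `max_steps` cap is an arbitrary safeguard.

```python
from pathlib import PurePosixPath


def checkpoint_path(seed, root="."):
    """Hypothetical helper: path of the latest checkpoint for one seed.

    Assumes the layout "<root>/<seed>/ckpt_latest.zip" seen for directory "0".
    """
    return str(PurePosixPath(root) / str(seed) / "ckpt_latest.zip")


def run_episode(env, model, seed=None, max_steps=1000):
    """Roll out one episode and return the list of per-step rewards."""
    obs, info = env.reset(seed=seed)
    rewards = []
    for _ in range(max_steps):
        # Same deterministic policy query as in the single-step example.
        action, _state = model.predict(obs, deterministic=True)
        obs, reward, terminated, truncated, info = env.step(action)
        rewards.append(reward)
        # Stop as soon as the environment signals the end of the episode.
        if terminated or truncated:
            break
    return rewards
```

With `env` and `PPO` set up as in the snippet above, `run_episode(env, PPO.load(checkpoint_path(0)), seed=42)` would evaluate the checkpoint in directory `0`. The rewards are returned unaggregated so that the caller can decide how to reduce them, which matters if the environment reports per-agent rewards rather than a scalar.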

	## References

	* [Plug-and-Play Benchmarking of Reinforcement Learning Algorithms for Large-Scale Flow Control](http://arxiv.org/abs/2601.15015)
	* [FluidGym GitHub Repository](https://github.com/safe-autonomous-systems/fluidgym)