This is a trained model of a **PPO** agent playing **LunarLander-v2**
using the [stable-baselines3 library](https://github.com/DLR-RM/stable-baselines3).

Hyperparameters used for training were optimized with [Optuna](https://pypi.org/project/optuna/).

```python
{
    "learning_rate": 0.00038779746460731866,
    "n_steps": 2048,
    "batch_size": 128,
    "n_epochs": 13,
    "gamma": 0.9927390555180292,
    "gae_lambda": 0.9353501463066322,
    "clip_range": clip_range,  # value defined elsewhere in the training script
    "ent_coef": 0.007068533587811773,
    "policy_kwargs": {
        "net_arch": {'pi': [512, 512], 'vf': [512, 512]},
        "activation_fn": nn.Tanh  # torch.nn.Tanh
    },
}
```

> Learning rate was used as the initial value for a linear scheduler during training. See [this GitHub issue](https://github.com/DLR-RM/stable-baselines3/issues/246) for more information.

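The scheduler note above can be sketched as a plain function: stable-baselines3 accepts a callable that maps the remaining training progress (1.0 at the start, 0.0 at the end) to a learning rate. The `linear_schedule` helper below is an illustrative assumption, not the exact code used to train this model.

```python
def linear_schedule(initial_value: float):
    """Build an SB3-style schedule; progress_remaining runs from 1.0 down to 0.0."""
    def schedule(progress_remaining: float) -> float:
        # Decay linearly from initial_value at the start of training to 0 at the end.
        return progress_remaining * initial_value
    return schedule

# Seed the schedule with the tuned learning rate from the hyperparameters above.
lr_schedule = linear_schedule(0.00038779746460731866)
```

Passing such a callable as `learning_rate` when constructing `PPO` yields the linear decay described in the linked issue.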
## Usage