reeeemo committed on
Commit
4062c8c
·
verified ·
1 Parent(s): 3a243fe

Update README.md

Files changed (1)
  1. README.md +20 -1
README.md CHANGED
@@ -25,7 +25,26 @@ model-index:
 This is a trained model of a **PPO** agent playing **LunarLander-v2**
 using the [stable-baselines3 library](https://github.com/DLR-RM/stable-baselines3).
 
-Hyperparameters were optimized with [Optuna](https://pypi.org/project/optuna/).
+Hyperparameters used to train were optimized with [Optuna](https://pypi.org/project/optuna/).
+
+```python
+{
+"learning_rate": 0.00038779746460731866,
+"n_steps": 2048,
+"batch_size": 128,
+"n_epochs": 13,
+"gamma": 0.9927390555180292,
+"gae_lambda": 0.9353501463066322,
+"clip_range": clip_range,
+"ent_coef": 0.007068533587811773,
+"policy_kwargs": {
+"net_arch": {'pi': [512, 512], 'vf': [512, 512]},
+"activation_fn": nn.Tanh
+},
+}
+```
+
+> Learning rate was used as an initial value for a linear scheduler during training. See [this github issue](https://github.com/DLR-RM/stable-baselines3/issues/246) for more information
 
 ## Usage
 
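The note added in this commit says the learning rate served as the initial value of a linear scheduler. The snippet below is a minimal sketch of what such a schedule could look like, assuming the stable-baselines3 convention that a schedule is a callable receiving `progress_remaining`, which runs from 1.0 at the start of training to 0.0 at the end. The names `linear_schedule` and `INITIAL_LEARNING_RATE` are illustrative and do not come from the commit; only the numeric value is taken from the hyperparameters it lists.

```python
from typing import Callable


def linear_schedule(initial_value: float) -> Callable[[float], float]:
    """Return a schedule that decays linearly from initial_value to 0."""

    def schedule(progress_remaining: float) -> float:
        # progress_remaining is assumed to go from 1.0 (start of training)
        # down to 0.0 (end of training), as in stable-baselines3.
        return progress_remaining * initial_value

    return schedule


# Initial learning rate copied from the hyperparameters in the README diff.
INITIAL_LEARNING_RATE = 0.00038779746460731866
learning_rate = linear_schedule(INITIAL_LEARNING_RATE)

if __name__ == "__main__":
    # Show the learning rate at the start, midpoint, and end of training.
    for progress in (1.0, 0.5, 0.0):
        print(f"progress_remaining={progress:.1f} -> lr={learning_rate(progress):.3e}")
```

A callable like this can be passed as the `learning_rate` argument of `PPO`, which accepts either a float or a schedule. The `clip_range` entry in the committed dictionary refers to a variable whose value the README does not show, so it is left out of this sketch.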