PPO Agent Playing LunarLander-v3

This is a trained model of a PPO agent playing LunarLander-v3.

Hyperparameters

timesteps=2e6,
steps_before_update=1000,
mini_batch_size=64,
epochs=3,
lr=3e-4,
gamma=0.99,
gae_lambda=0.95,
clip_coef=0.2,
norm_adv=True,
vf_coef=0.5,
ent_coef=0.05,
max_grad_norm=1.0,
target_kl=0.015
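
To make the listed values easier to reason about, here is a minimal sketch that collects them in a config object and derives the update schedule they imply. The names `PPOConfig`, `update_schedule`, and `clipped_surrogate` are hypothetical and not taken from the training code. The sketch assumes that `steps_before_update` is the rollout length collected before each policy update, that the final partial mini-batch is kept, and that `clip_coef` is used in the standard PPO clipped objective.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PPOConfig:
    # Values copied from the hyperparameter list above.
    timesteps: int = 2_000_000
    steps_before_update: int = 1000
    mini_batch_size: int = 64
    epochs: int = 3
    lr: float = 3e-4
    gamma: float = 0.99
    gae_lambda: float = 0.95
    clip_coef: float = 0.2
    norm_adv: bool = True
    vf_coef: float = 0.5
    ent_coef: float = 0.05
    max_grad_norm: float = 1.0
    target_kl: float = 0.015


def update_schedule(cfg):
    """Return (policy updates, mini-batches per epoch, upper bound on gradient steps).

    Assumption: one update per `steps_before_update` environment steps,
    with the last partial mini-batch included.
    """
    num_updates = cfg.timesteps // cfg.steps_before_update
    minibatches_per_epoch = math.ceil(cfg.steps_before_update / cfg.mini_batch_size)
    # Upper bound only: early stopping on target_kl may skip some epochs.
    max_grad_steps = num_updates * cfg.epochs * minibatches_per_epoch
    return num_updates, minibatches_per_epoch, max_grad_steps


def clipped_surrogate(ratio, advantage, clip_coef):
    """Standard PPO clipped policy loss for a single sample (to be minimized)."""
    clipped_ratio = max(1.0 - clip_coef, min(ratio, 1.0 + clip_coef))
    return -min(ratio * advantage, clipped_ratio * advantage)


cfg = PPOConfig()
print(update_schedule(cfg))
print(clipped_surrogate(1.5, 1.0, cfg.clip_coef))
```

Under these assumptions, training consists of 2000 policy updates, each running at most 3 epochs of 16 mini-batches. With `clip_coef=0.2`, a probability ratio of 1.5 on a positive advantage is clipped to 1.2, so the sample gives no incentive to move the policy further in that direction.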