--- tags: - LunarLander-v2 - ppo - deep-reinforcement-learning - reinforcement-learning - custom-implementation - deep-rl-course model-index: - name: PPO results: - task: type: reinforcement-learning name: reinforcement-learning dataset: name: LunarLander-v2 type: LunarLander-v2 metrics: - type: mean_reward value: -121.80 +/- 28.82 name: mean_reward verified: false --- # PPO Agent Playing LunarLander-v2 This is a trained model of a PPO agent playing LunarLander-v2. # Hyperparameters ```python exp_name: ppo seed: 1 torch_deterministic: True cuda: True track: False wandb_project_name: cleanRL wandb_entity: None capture_video: False env_id: LunarLander-v2 total_timesteps: 100000 learning_rate: 0.00025 num_envs: 4 num_steps: 128 anneal_lr: True gae: True gamma: 0.99 gae_lambda: 0.95 num_minibatches: 4 update_epochs: 4 norm_adv: True clip_coef: 0.2 clip_vloss: True ent_coef: 0.01 vf_coef: 0.5 max_grad_norm: 0.5 target_kl: None repo_id: LizardAPN/LunarLander-v2-with-ppo batch_size: 512 minibatch_size: 128 ```