metadata
model-index:
- name: PPO LunarLander-v2
  results:
  - task:
      type: reinforcement-learning
      name: reinforcement-learning
    dataset:
      name: LunarLander-v2
      type: LunarLander-v2
    metrics:
    - type: mean_reward
      value: '-132.23 +/- 107.40'
      name: mean_reward
      verified: false
PPO Agent for LunarLander-v2
Mean reward: -132.23 ± 107.40
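
The reported metric has the form mean ± standard deviation of per-episode returns. The card does not say how many evaluation episodes were run or which tooling produced the number, so the following is only a minimal sketch of how such a summary could be formed. The helpers `summarize_returns` and `format_metric` and the sample returns are hypothetical and are not the agent's actual evaluation data; the use of the population standard deviation is also an assumption.

```python
import statistics


def summarize_returns(episode_returns):
    """Return (mean, std) of a list of per-episode returns.

    Assumption: population standard deviation is used; the card does not
    state which convention produced the reported figure.
    """
    mean = statistics.fmean(episode_returns)
    std = statistics.pstdev(episode_returns)
    return mean, std


def format_metric(mean, std):
    """Format the summary in the same 'mean +/- std' style as the metadata."""
    return f"{mean:.2f} +/- {std:.2f}"


# Hypothetical episode returns, for illustration only.
sample_returns = [-250.0, -50.0, -100.0, -200.0]
mean, std = summarize_returns(sample_returns)
print(format_metric(mean, std))
```

A standard deviation close in size to the magnitude of the mean, as in the reported value, indicates that episode outcomes vary widely from one run to the next.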