LunarLander-v2 / results.json
Trained PPO model with 1,000,000 iterations.
b84aadb verified
{"mean_reward": 253.8389223, "std_reward": 21.49592603956499, "is_deterministic": true, "n_eval_episodes": 10, "eval_datetime": "2024-01-27T21:49:12.069276"}