ppo-LunarLander-v2 / README.md

Commit History

Model trained for 10 million timesteps with mean_reward=286.17
f923604

sam133 committed on