SB3 PPO with 16 vectorized environments, trained for ~9_000_000 timesteps. mean_reward = 163 +/- 103. Training for an additional 50_000_000 timesteps gave a worse reward at evaluation.
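A minimal sketch of how a run like this might be set up and scored with Stable-Baselines3. The environment is not named here, so `env_id="CartPole-v1"` is only a placeholder assumption, as are the function names, the default `MlpPolicy` hyperparameters, and `n_eval_episodes=10`. The "+/-" figure is taken to be the standard deviation of per-episode returns, which is what `evaluate_policy` reports.

```python
from statistics import fmean, pstdev


def summarize_returns(episode_returns):
    """Return (mean, std) of per-episode returns, using the population std."""
    return fmean(episode_returns), pstdev(episode_returns)


def train_and_evaluate(env_id="CartPole-v1",  # placeholder: the real env is not stated
                       n_envs=16,
                       total_timesteps=9_000_000,
                       n_eval_episodes=10):   # assumed value
    # Imported here so summarize_returns() works without SB3 installed.
    from stable_baselines3 import PPO
    from stable_baselines3.common.env_util import make_vec_env
    from stable_baselines3.common.evaluation import evaluate_policy

    # 16 copies of the environment stepped together as one vectorized env.
    vec_env = make_vec_env(env_id, n_envs=n_envs)

    model = PPO("MlpPolicy", vec_env, verbose=0)
    model.learn(total_timesteps=total_timesteps)

    # With return_episode_rewards=True, evaluate_policy returns the raw
    # per-episode returns and lengths instead of the (mean, std) pair.
    episode_returns, _ = evaluate_policy(
        model,
        vec_env,
        n_eval_episodes=n_eval_episodes,
        deterministic=True,
        return_episode_rewards=True,
    )
    return summarize_returns(episode_returns)


if __name__ == "__main__":
    mean_reward, std_reward = train_and_evaluate()
    print(f"mean_reward={mean_reward:.0f} +/- {std_reward:.0f}")
```

Since longer training scored worse here, it may be worth evaluating at intervals and keeping the best checkpoint rather than the last one. SB3's `EvalCallback` with `best_model_save_path` is one way to do that.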
28a0b97
- Xet hash: ab9002a3499f5eb8c30a95dd3912f1c90d6e18734c7ecc4c5a06758428bb456b
- Size of remote file: 143 kB
- SHA256: c8ec5b61cca311960bc2daad573fd804669e6a07337c98f190ff6b8dc7fe7f10