CoreyMorris/ppo-Pixelcopter-PLE-v0

Reinforcement Learning
stable-baselines3
Pixelcopter-PLE-v0
deep-reinforcement-learning
Eval Results (legacy)
ppo-Pixelcopter-PLE-v0
301 kB
  • 1 contributor
History: 2 commits
CoreyMorris
SB3 PPO with 16 vectorized environments. ~9_000_000 timesteps of training. mean_reward=163 +/- 103. Training for an additional 50_000_000 timesteps resulted in a worse reward at evaluation.
28a0b97 about 3 years ago
  • Pixelcopter-PLE-v0_4
    SB3 PPO training commit (28a0b97) about 3 years ago
  • .gitattributes
    1.48 kB
    initial commit about 3 years ago
  • Pixelcopter-PLE-v0_4.zip
    143 kB
    SB3 PPO training commit (28a0b97) about 3 years ago
  • README.md
    805 Bytes
    SB3 PPO training commit (28a0b97) about 3 years ago
  • config.json
    12.9 kB
    SB3 PPO training commit (28a0b97) about 3 years ago
  • results.json
    152 Bytes
    SB3 PPO training commit (28a0b97) about 3 years ago
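
The commit message reports the result as mean_reward=163 +/- 103 after roughly 9_000_000 timesteps on 16 vectorized environments. As a rough sketch of what such figures usually mean, the snippet below assumes two things that this page does not confirm: that the reward is the mean and population standard deviation of per-episode returns (the convention Stable-Baselines3 uses when evaluating a policy), and that the timestep count is the total summed across all environments. The function names and the sample returns are hypothetical and are not taken from this repository's results.json.

```python
from statistics import fmean, pstdev


def summarize_returns(episode_returns):
    """Return (mean, std) of per-episode returns, using the population std.

    Assumption: this mirrors the common "mean +/- std over evaluation
    episodes" convention; it is not confirmed to be how this model was scored.
    """
    if not episode_returns:
        raise ValueError("need at least one episode return")
    return fmean(episode_returns), pstdev(episode_returns)


def format_result(mean_reward, std_reward):
    """Format a result in the same style as the commit message."""
    return f"mean_reward={mean_reward:.0f} +/- {std_reward:.0f}"


def steps_per_env(total_timesteps, n_envs):
    """Steps taken by each environment, assuming total_timesteps is summed
    across all vectorized environments."""
    return total_timesteps // n_envs


# Made-up episode returns, for illustration only.
illustrative_returns = [10.0, 20.0, 30.0]
mean_reward, std_reward = summarize_returns(illustrative_returns)
print(format_result(mean_reward, std_reward))

# Under the stated assumption, 9_000_000 total timesteps over 16 environments.
print(steps_per_env(9_000_000, 16))
```

A standard deviation as large as 103 on a mean of 163 suggests episode returns vary widely, so comparing two checkpoints (such as the 9_000_000-timestep model and the later, worse one) is more reliable with many evaluation episodes.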