PPO Agent Playing CartPole-v1

This agent was trained in Google Colab with a minimal, CleanRL-style PPO implementation.

Results

  • Mean reward: 83.60
  • Std reward: 50.09
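
The card does not say how these figures were computed. As a rough illustration only, the sketch below shows one common way to get a mean and standard deviation of episode returns. The function name `evaluate`, the `policy` callable, the default of 10 episodes, and the use of the population standard deviation are all assumptions, not details taken from the original training notebook. The environment is assumed to follow the Gymnasium-style API, where `reset()` returns `(obs, info)` and `step()` returns a 5-tuple.

```python
import statistics


def evaluate(env, policy, n_episodes=10):
    """Return (mean, std) of undiscounted episode returns.

    Assumptions (not from the model card): `env` follows the
    Gymnasium-style API, `policy` maps an observation to an action,
    and std is the population standard deviation.
    """
    returns = []
    for _ in range(n_episodes):
        obs, _info = env.reset()
        done = False
        total = 0.0
        while not done:
            action = policy(obs)
            obs, reward, terminated, truncated, _info = env.step(action)
            total += float(reward)
            # An episode ends on failure (terminated) or on the time limit (truncated).
            done = terminated or truncated
        returns.append(total)
    return statistics.mean(returns), statistics.pstdev(returns)
```

With Gymnasium installed, this could be called as `evaluate(gymnasium.make("CartPole-v1"), policy)`, where `policy` wraps the trained agent's action selection. CartPole-v1 truncates episodes at 500 steps, so 500 is the highest return a single episode can reach.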
