Policy Gradient agent (REINFORCE) for CartPole-v1
This repository contains a simple Policy Gradient (REINFORCE) agent implemented in PyTorch and trained on CartPole-v1 as part of the Hugging Face Deep Reinforcement Learning Course (Unit 4).
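REINFORCE weights the log-probability of each action by the discounted return that followed it. The sketch below shows only that return computation, in plain Python. The function name `discounted_returns` and the default `gamma=0.99` are illustrative assumptions, not values taken from this repository; the discount factor actually used for training would be whatever pg_config.yml specifies.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * r_{t+1} + gamma^2 * r_{t+2} + ... for every step t.

    `gamma` here is an assumed default, not the repository's configured value.
    """
    returns = []
    running = 0.0
    # Walk the episode backwards so each return reuses the one after it.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


# CartPole gives a reward of 1.0 per step; gamma=0.5 keeps the arithmetic easy to follow.
print(discounted_returns([1.0, 1.0, 1.0], gamma=0.5))  # [1.75, 1.5, 1.0]
```

In a full training loop these returns would multiply the negative log-probabilities of the sampled actions to form the policy loss.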
Files:
- policy_ep2000.pt: trained model weights (state_dict).
- pg_config.yml: training configuration (YAML).
Evaluation results
- mean_reward on CartPole-v1 (self-reported): 500.0 +/- 0.0
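CartPole-v1 truncates episodes at 500 steps with a reward of 1 per step, so a mean of 500.0 with a spread of 0.0 means every evaluation episode reached the cap. As a minimal sketch of how such a figure is typically summarized, the snippet below formats a list of per-episode rewards as mean +/- standard deviation. The helper name `summarize` and the count of 10 episodes are assumptions for illustration; the number of evaluation episodes behind the reported score is not stated here.

```python
import statistics


def summarize(episode_rewards):
    """Format per-episode rewards as 'mean +/- std' using the population std."""
    mean = statistics.fmean(episode_rewards)
    std = statistics.pstdev(episode_rewards)
    return f"{mean:.1f} +/- {std:.1f}"


# Hypothetical evaluation run: 10 episodes that all hit the 500-step cap.
rewards = [500.0] * 10
print(summarize(rewards))  # 500.0 +/- 0.0
```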