REINFORCE Agent on Pixelcopter-PLE-v0

This repository contains a REINFORCE (policy gradient) agent trained on Pixelcopter-PLE-v0.

Evaluation

  • Mean reward: 48.95 ± 42.79
  • Episodes: 20
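
A figure of the form "mean ± spread over N episodes" is usually produced by summing the reward of each evaluation episode and then summarizing the totals. The sketch below assumes the spread is a population standard deviation, which this card does not confirm; the function name and the sample rewards are hypothetical and are not this model's actual episodes.

```python
import statistics


def summarize_rewards(episode_rewards):
    """Return (mean, std) of a list of per-episode total rewards."""
    mean_reward = statistics.fmean(episode_rewards)
    # Population standard deviation (an assumption, not stated in the card).
    std_reward = statistics.pstdev(episode_rewards)
    return mean_reward, std_reward


# Hypothetical totals for illustration only.
example_rewards = [10.0, 20.0, 30.0]
mean_reward, std_reward = summarize_rewards(example_rewards)
print(f"Mean reward: {mean_reward:.2f} ± {std_reward:.2f}")
```

With only 20 episodes and a spread close to the mean, individual episode scores can differ widely from the reported average.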

Algorithm

  • Monte Carlo Policy Gradient
  • Stochastic policy
  • PyTorch implementation
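
The Monte Carlo part of the algorithm can be sketched as follows: after an episode ends, each step is credited with the discounted sum of the rewards that followed it, and the loss weights the log-probability of each sampled action by that return. This is a minimal plain-Python illustration, not the repository's code; the discount factor `gamma=0.99` and both function names are assumptions. A PyTorch implementation would typically compute the same quantity with tensors and call `backward()` on the loss.

```python
def discounted_returns(rewards, gamma=0.99):
    """Compute G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = []
    g = 0.0
    # Walk backwards so each return reuses the one that follows it.
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


def reinforce_loss(log_probs, returns):
    """REINFORCE loss: negative sum of log pi(a_t | s_t) * G_t."""
    return -sum(lp * g for lp, g in zip(log_probs, returns))


# Hypothetical three-step episode with reward 1 at every step.
rewards = [1.0, 1.0, 1.0]
returns = discounted_returns(rewards, gamma=0.5)
# Hypothetical log-probabilities of the actions that were sampled.
log_probs = [-0.5, -1.0, -2.0]
loss = reinforce_loss(log_probs, returns)
```

Minimizing this loss raises the probability of actions that were followed by high returns. Because the returns come from complete sampled episodes, the gradient estimate is unbiased but has high variance.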