This is a REINFORCE agent trained on Pixelcopter from the PyGame Learning Environment (PLE), using a custom environment wrapper.
Mean reward: 4.84 ± 5.76
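
For readers unfamiliar with REINFORCE, the sketch below shows the two quantities the algorithm is built around: the discounted return of each step in an episode and the Monte Carlo policy-gradient loss. It is an illustration only, not this agent's training code. The function names and the default `gamma=0.99` are assumptions, and plain floats stand in for the autograd tensors a real training loop would use.

```python
from typing import List, Sequence


def discounted_returns(rewards: Sequence[float], gamma: float = 0.99) -> List[float]:
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode.

    gamma=0.99 is an assumed default, not a value taken from this model.
    """
    returns: List[float] = []
    running = 0.0
    # Walk the episode backwards so each return reuses the one after it.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def reinforce_loss(log_probs: Sequence[float], returns: Sequence[float]) -> float:
    """Monte Carlo policy-gradient loss: -sum_t log pi(a_t | s_t) * G_t."""
    if len(log_probs) != len(returns):
        raise ValueError("log_probs and returns must have the same length")
    return -sum(lp * g for lp, g in zip(log_probs, returns))


if __name__ == "__main__":
    episode_rewards = [1.0, 1.0, 1.0]
    print(discounted_returns(episode_rewards, gamma=0.5))
```

Minimizing this loss raises the log-probability of actions that were followed by high returns. Because the returns come from single sampled episodes, the gradient estimate is noisy, which is one plausible reason for a large spread in evaluation reward.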