REINFORCE Agent for Pixelcopter-PLE-v0

Model Description

This repository contains a REINFORCE (Monte Carlo policy gradient) agent trained to play Pixelcopter-PLE-v0, a helicopter navigation game from the PyGame Learning Environment (PLE). The agent learns a flight-control policy by trial and error: it plays complete episodes, then adjusts its policy so that actions followed by high returns become more likely.

Model Details

  • Algorithm: REINFORCE (Monte Carlo Policy Gradient)
  • Environment: Pixelcopter-PLE-v0 (PyGame Learning Environment)
  • Framework: Custom implementation following Deep RL Course guidelines
  • Task Type: Discrete Control (Binary Actions)
  • Action Space: Discrete (2 actions: do nothing or thrust up)
  • Observation Space: Either raw screen pixels or a feature-based game-state vector, depending on how the environment is configured
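
As a rough illustration of what "Monte Carlo Policy Gradient" means for a two-action task like this one, the sketch below computes discounted returns for one finished episode and the REINFORCE loss built from them. It is a minimal, framework-free sketch and not the training code behind this model: the function names, the discount factor, and the toy rewards are all assumptions made for the example, and a real implementation would typically use an autodiff library to differentiate the loss with respect to the policy parameters.

```python
import math


def softmax(logits):
    """Turn raw action scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def discounted_returns(rewards, gamma):
    """Return G_t = r_t + gamma * r_{t+1} + ... for every step of one episode."""
    returns = []
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def reinforce_loss(log_probs, returns):
    """REINFORCE objective as a loss: -sum_t log pi(a_t | s_t) * G_t."""
    return -sum(lp * g for lp, g in zip(log_probs, returns))


# Hypothetical three-step episode; rewards and gamma are made up for the demo.
rewards = [1.0, 0.0, 1.0]
returns = discounted_returns(rewards, gamma=0.5)  # [1.25, 0.5, 1.0]

# A policy that is indifferent between "do nothing" and "thrust up".
probs = softmax([0.0, 0.0])                       # [0.5, 0.5]
log_probs = [math.log(probs[1])] * len(rewards)   # agent chose action 1 each step
loss = reinforce_loss(log_probs, returns)
```

Minimizing this loss raises the log-probability of actions in proportion to the return that followed them, which is why the method needs complete episodes before it can update.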

Environment Overview

Pixelcopter-PLE-v0 is a classic helicopter control game where:

  • Objective: Navigate a helicopter through obstacles without crashing
  • Challenge: Requires precise timing and control to avoid ceiling, floor, and obstacles
  • Physics: Gravity constantly pulls the helicopter down; player must apply thrust to maintain altitude
  • Scoring: Points are awarded for surviving longer and successfully navigating through gaps
  • Difficulty: Requires learning temporal dependencies and precise action timing

Performance

The trained REINFORCE agent achieves the following performance metrics:

  • Mean Reward: 13.10 ± 6.89
  • Performance Analysis: This represents solid performance for this challenging environment
  • Consistency: The standard deviation is roughly half the mean, so scores vary considerably between episodes. This is expected for REINFORCE, whose Monte Carlo gradient estimates are noisy.
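
A "mean ± standard deviation" figure like the one above is usually computed from the total reward of several evaluation episodes. The sketch below shows that calculation on made-up numbers. The helper names are hypothetical, the episode returns are not this model's data, and the choice of population standard deviation is an assumption, since this card does not state which variant or how many episodes were used.

```python
from statistics import fmean, pstdev


def summarize_returns(episode_returns):
    """Return (mean, population standard deviation) of per-episode total rewards."""
    return fmean(episode_returns), pstdev(episode_returns)


def format_summary(mean, std):
    """Format the pair the way the metric is reported, e.g. '12.00 ± 1.63'."""
    return f"{mean:.2f} ± {std:.2f}"


# Illustrative returns from three hypothetical evaluation episodes.
mean, std = summarize_returns([10.0, 12.0, 14.0])
summary = format_summary(mean, std)  # "12.00 ± 1.63"
```

With only a handful of episodes the estimate itself is noisy, so a larger evaluation set gives a more trustworthy figure.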

Educational Resources

This model was developed following Unit 4 of the Deep Reinforcement Learning Course.

For comprehensive learning about REINFORCE and policy gradient methods, refer to the complete course materials.
