PPO Expert Agent for ALE/SpaceInvaders-v5 (10M Steps)

This is a Proximal Policy Optimization (PPO) agent trained on Atari Space Invaders (v5) using a vectorized environment setup.

Training Details

  • Algorithm: PPO
  • Environment: ALE/SpaceInvaders-v5 (with sticky actions)
  • Total Timesteps: 10,000,000
  • Frame Stacking: 4 frames
  • Terminal on Life Loss: True (during training)
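The frame stacking and life-loss settings above can be sketched in plain Python. This is a minimal illustration, not the training code behind this checkpoint: FrameStack and training_done are hypothetical names, and frames are treated as opaque objects.

```python
from collections import deque


class FrameStack:
    """Keep the most recent k frames as one stacked observation (hypothetical helper)."""

    def __init__(self, k=4):
        self.k = k
        self.frames = deque(maxlen=k)

    def reset(self, first_frame):
        # Fill the buffer with the first frame so the stack is full from step 0.
        self.frames.clear()
        for _ in range(self.k):
            self.frames.append(first_frame)
        return list(self.frames)

    def step(self, frame):
        # Appending to a full deque drops the oldest frame automatically.
        self.frames.append(frame)
        return list(self.frames)


def training_done(terminated, lives_before, lives_after):
    """Treat a lost life as the end of an episode, for training only."""
    return terminated or lives_after < lives_before
```

At evaluation time the life-loss rule would normally be switched off, so that a reported score covers a full game rather than a single life.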

Performance

  • Peak Score observed: 615.0
  • Average Reward: approximately 300-450 at 10M steps.
  • Behavior: Learned to clear multiple waves, use shields for cover, and target the Mystery Ship.

Usage

import torch

# Assumes ActorCritic and Config are already defined in your script.
config = Config()
# Four stacked frames in, six discrete Space Invaders actions out.
model = ActorCritic(input_channels=4, action_dim=6)
# map_location='cpu' lets the checkpoint load on a machine without a GPU.
model.load_state_dict(torch.load('ppo_final_10M.pt', map_location='cpu'))
model.eval()  # switch to evaluation mode
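Once the model is loaded, an action has to be chosen from its output at each step. This card does not say what ActorCritic returns, so the sketch below assumes the policy head yields one logit per action (six for Space Invaders) and shows greedy and sampled selection in plain Python. The function names are hypothetical.

```python
import math
import random

NUM_ACTIONS = 6  # matches action_dim=6 in the loading snippet


def softmax(logits):
    # Subtract the max before exponentiating for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def greedy_action(logits):
    # Deterministic choice: index of the largest logit.
    return max(range(len(logits)), key=lambda i: logits[i])


def sample_action(logits, rng=random):
    # Stochastic choice: draw an index in proportion to the softmax probabilities.
    probs = softmax(logits)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

Greedy selection is the usual choice for a reproducible evaluation. Sampling matches how PPO acts during training.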