PPO Agent playing SpaceInvadersNoFrameskip-v4

This is a trained model of a PPO agent playing SpaceInvadersNoFrameskip-v4 using CleanRL.

Usage (with CleanRL)

import torch
import gymnasium as gym
from PPO_atari import Agent

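# NOTE: CleanRL's Atari agents are normally trained on preprocessed observations
# (grayscale, 84x84, 4 stacked frames). If model.pth was trained that way, wrap
# the environment with the same preprocessing before running the agent.
env = gym.make("SpaceInvadersNoFrameskip-v4")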

# Load the model
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
agent = Agent(env).to(device)
agent.load_state_dict(torch.load("model.pth", map_location=device))
agent.eval()

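# Run one evaluation episode
obs, _ = env.reset()
done = False
while not done:
    obs_tensor = torch.as_tensor(obs).unsqueeze(0).to(device)
    with torch.no_grad():  # inference only, no gradients needed
        action, _, _, _ = agent.get_action_and_value(obs_tensor)
    obs, reward, terminated, truncated, _ = env.step(action.cpu().numpy()[0])
    done = terminated or truncated  # without this update the loop never exits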

Evaluation results

mean_reward on SpaceInvadersNoFrameskip-v4 (self-reported): 1000.28 +/- 50.13
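A figure in the form "mean +/- spread" is typically obtained by running several evaluation episodes and summarizing the per-episode returns. The sketch below illustrates that computation under stated assumptions: the model card does not say how many episodes were used or which standard deviation was reported, so `n_episodes=10` and the population standard deviation are guesses. `ToyEnv` and `evaluate` are hypothetical names introduced here for illustration. `ToyEnv` only mimics the Gymnasium `reset`/`step` signatures so the example runs without Atari dependencies, and its rewards have nothing to do with Space Invaders.

```python
import random
import statistics


class ToyEnv:
    """Hypothetical stand-in that follows the Gymnasium reset/step signatures.

    It is NOT SpaceInvaders; it only lets the evaluation loop run standalone.
    """

    def __init__(self, episode_length=5, seed=0):
        self.episode_length = episode_length
        self.rng = random.Random(seed)
        self.t = 0

    def reset(self):
        self.t = 0
        return 0, {}  # (observation, info)

    def step(self, action):
        self.t += 1
        reward = float(self.rng.randint(0, 30))  # arbitrary toy reward
        terminated = self.t >= self.episode_length
        return self.t, reward, terminated, False, {}


def evaluate(env, policy, n_episodes=10):
    """Run n_episodes and return (mean, population std, per-episode returns)."""
    returns = []
    for _ in range(n_episodes):
        obs, _ = env.reset()
        done, episode_return = False, 0.0
        while not done:
            obs, reward, terminated, truncated, _ = env.step(policy(obs))
            episode_return += reward
            done = terminated or truncated
        returns.append(episode_return)
    # Assumption: population std (what numpy.std computes by default).
    return statistics.mean(returns), statistics.pstdev(returns), returns


mean_reward, std_reward, returns = evaluate(ToyEnv(), policy=lambda obs: 0)
print(f"mean_reward: {mean_reward:.2f} +/- {std_reward:.2f}")
```

To apply the same loop to the trained agent, `policy` would wrap the `agent.get_action_and_value` call from the usage example and `env` would be the real Atari environment.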