PPO Agent playing SpaceInvadersNoFrameskip-v4
This is a trained model of a PPO agent playing SpaceInvadersNoFrameskip-v4 using CleanRL.
Usage (with CleanRL)
import torch
import gymnasium as gym
from PPO_atari import Agent
env = gym.make("SpaceInvadersNoFrameskip-v4")
# Load the model
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
agent = Agent(env).to(device)
agent.load_state_dict(torch.load("model.pth", map_location=device))
agent.eval()
# Run evaluation
obs, _ = env.reset()
done = False
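while not done:
    obs_tensor = torch.tensor(obs).unsqueeze(0).to(device)  # add batch dimension
    with torch.no_grad():
        action, _, _, _ = agent.get_action_and_value(obs_tensor)
    obs, reward, terminated, truncated, _ = env.step(action.cpu().numpy()[0])
    done = terminated or truncated  # end the loop when the episode finishes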
Evaluation results
- mean_reward on SpaceInvadersNoFrameskip-v4 (self-reported): 1000.28 +/- 50.13
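
The card does not say how the figure above was computed. A common convention is to report the mean and standard deviation of per-episode returns over several evaluation episodes. The sketch below shows that calculation under this assumption. The helper names `summarize_returns` and `format_result` are hypothetical, the choice of population standard deviation is an assumption, and the sample returns are made-up placeholders rather than results from this model.

```python
import statistics


def summarize_returns(episode_returns):
    """Return (mean, std) of a list of per-episode returns.

    Assumption: population standard deviation (the NumPy default,
    ddof=0). The card does not state which variant was used.
    """
    if not episode_returns:
        raise ValueError("need at least one episode return")
    mean = statistics.fmean(episode_returns)
    std = statistics.pstdev(episode_returns)
    return mean, std


def format_result(mean, std):
    """Format a result in the 'mean +/- std' style used on this card."""
    return f"{mean:.2f} +/- {std:.2f}"


if __name__ == "__main__":
    # Placeholder returns for illustration only, not measured values.
    sample_returns = [900.0, 1000.0, 1100.0]
    mean, std = summarize_returns(sample_returns)
    print(format_result(mean, std))  # prints "1000.00 +/- 81.65"
```

To apply this to the agent, collect the sum of rewards from each evaluation episode into a list and pass that list to `summarize_returns`.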