metadata
library_name: cleanrl
tags:
  - SpaceInvadersNoFrameskip-v4
  - deep-reinforcement-learning
  - reinforcement-learning
  - cleanrl
model-index:
  - name: PPO
    results:
      - task:
          type: reinforcement-learning
          name: reinforcement-learning
        dataset:
          name: SpaceInvadersNoFrameskip-v4
          type: SpaceInvadersNoFrameskip-v4
        metrics:
          - type: mean_reward
            value: 1000.28 +/- 50.13
            name: mean_reward
            verified: false

PPO Agent playing SpaceInvadersNoFrameskip-v4

This is a trained model of a PPO agent playing SpaceInvadersNoFrameskip-v4 using CleanRL.

Usage (with CleanRL)

import torch
import gymnasium as gym
from PPO_atari import Agent

env = gym.make("SpaceInvadersNoFrameskip-v4")

# Load the model
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
agent = Agent(env).to(device)
agent.load_state_dict(torch.load("model.pth", map_location=device))
agent.eval()

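# Run one evaluation episode
obs, _ = env.reset()
done = False
while not done:
    with torch.no_grad():  # inference only, no gradients needed
        action, _, _, _ = agent.get_action_and_value(torch.tensor(obs).unsqueeze(0).to(device))
    obs, reward, terminated, truncated, _ = env.step(action.cpu().numpy()[0])
    # Stop when the episode ends; without this update the loop never exits
    done = terminated or truncated
env.close()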
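
The metadata reports mean_reward as "1000.28 +/- 50.13". A figure of this shape could be produced roughly as sketched below, assuming it is the mean and standard deviation of undiscounted episode returns over several evaluation episodes. The card states neither the number of episodes nor the kind of standard deviation, so `n_episodes=10` and the population standard deviation are assumptions. `run_episode`, `mean_reward`, and `policy` are hypothetical names for illustration and are not part of CleanRL.

```python
from statistics import mean, pstdev


def run_episode(env, policy):
    """Play one episode and return the undiscounted sum of rewards.

    `env` is expected to follow the Gymnasium reset/step API;
    `policy` is any callable mapping an observation to an action.
    """
    obs, _ = env.reset()
    total, done = 0.0, False
    while not done:
        obs, reward, terminated, truncated, _ = env.step(policy(obs))
        total += float(reward)
        done = terminated or truncated
    return total


def mean_reward(env, policy, n_episodes=10):
    """Return (mean, population std) of returns over n_episodes episodes."""
    returns = [run_episode(env, policy) for _ in range(n_episodes)]
    return mean(returns), pstdev(returns)
```

With the agent loaded as above, `policy` could be a small wrapper that converts the observation to a tensor, calls `agent.get_action_and_value` under `torch.no_grad()`, and returns the action as a plain integer.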