---
library_name: cleanrl
tags:
- SpaceInvadersNoFrameskip-v4
- deep-reinforcement-learning
- reinforcement-learning
- cleanrl
model-index:
- name: PPO
  results:
  - task:
      type: reinforcement-learning
      name: reinforcement-learning
    dataset:
      name: SpaceInvadersNoFrameskip-v4
      type: SpaceInvadersNoFrameskip-v4
    metrics:
    - type: mean_reward
      value: 1000.28 +/- 50.13
      name: mean_reward
      verified: false
---

# **PPO** Agent playing **SpaceInvadersNoFrameskip-v4**

This is a trained model of a **PPO** agent playing **SpaceInvadersNoFrameskip-v4** using [CleanRL](https://github.com/vwxyzjn/cleanrl).

## Usage (with CleanRL)

```python
import torch
import gymnasium as gym

from PPO_atari import Agent

env = gym.make("SpaceInvadersNoFrameskip-v4")

# Load the model
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
agent = Agent(env).to(device)
agent.load_state_dict(torch.load("model.pth", map_location=device))
agent.eval()

# Run evaluation
obs, _ = env.reset()
done = False
while not done:
    action, _, _, _ = agent.get_action_and_value(torch.tensor(obs).unsqueeze(0).to(device))
    obs, reward, terminated, truncated, _ = env.step(action.cpu().numpy()[0])
    # End the episode when the environment terminates or is truncated
    done = terminated or truncated