---
library_name: cleanrl
tags:
- SpaceInvadersNoFrameskip-v4
- deep-reinforcement-learning
- reinforcement-learning
- cleanrl
model-index:
- name: PPO
  results:
  - task:
      type: reinforcement-learning
      name: reinforcement-learning
    dataset:
      name: SpaceInvadersNoFrameskip-v4
      type: SpaceInvadersNoFrameskip-v4
    metrics:
    - type: mean_reward
      value: 1000.28 +/- 50.13
      name: mean_reward
      verified: false
---

# **PPO** Agent playing **SpaceInvadersNoFrameskip-v4**

This is a trained model of a **PPO** agent playing **SpaceInvadersNoFrameskip-v4**
using [CleanRL](https://github.com/vwxyzjn/cleanrl).
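
The `mean_reward` value in the metadata is written as a mean followed by `+/-` and a spread. As a minimal sketch of how a figure in that form could be aggregated from per-episode returns: the helper name `summarize_returns` is hypothetical, the choice of population standard deviation is an assumption (the card does not say which was used), and the returns below are made up for illustration.

```python
import statistics


def summarize_returns(episode_returns):
    """Format per-episode returns as 'mean +/- std' (hypothetical helper)."""
    mean = statistics.fmean(episode_returns)
    # Population standard deviation; an assumption, not confirmed by the card.
    std = statistics.pstdev(episode_returns)
    return f"{mean:.2f} +/- {std:.2f}"


# Made-up returns for illustration only; these are not results from this model.
example_returns = [900.0, 1000.0, 1100.0]
summary = summarize_returns(example_returns)
print(summary)
```

In practice the list would hold the undiscounted return of each evaluation episode, collected with the same environment wrappers used during training.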

## Usage (with CleanRL)

```python
import torch
import gymnasium as gym
from PPO_atari import Agent

env = gym.make("SpaceInvadersNoFrameskip-v4")

# Load the model
# Agent must be the same class (with the same observation preprocessing) used in training
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
agent = Agent(env).to(device)
agent.load_state_dict(torch.load("model.pth", map_location=device))
agent.eval()

# Run evaluation
obs, _ = env.reset()
done = False
while not done:
    # Add a batch dimension and move the observation to the model's device
    obs_tensor = torch.tensor(obs).unsqueeze(0).to(device)
    with torch.no_grad():  # inference only, so gradients are not needed
        action, _, _, _ = agent.get_action_and_value(obs_tensor)
    obs, reward, terminated, truncated, _ = env.step(action.cpu().numpy()[0])