---
library_name: cleanrl
tags:
- SpaceInvadersNoFrameskip-v4
- deep-reinforcement-learning
- reinforcement-learning
- cleanrl
model-index:
- name: PPO
  results:
  - task:
      type: reinforcement-learning
      name: reinforcement-learning
    dataset:
      name: SpaceInvadersNoFrameskip-v4
      type: SpaceInvadersNoFrameskip-v4
    metrics:
    - type: mean_reward
      value: 1000.28 +/- 50.13
      name: mean_reward
      verified: false
---
# **PPO** Agent playing **SpaceInvadersNoFrameskip-v4**
This is a trained model of a **PPO** agent playing **SpaceInvadersNoFrameskip-v4**
using [CleanRL](https://github.com/vwxyzjn/cleanrl).
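The `mean_reward` value in the metadata above is written as a mean followed by `+/-` and a spread. The card does not say how that figure was computed, so the snippet below is only a minimal sketch of one common way to produce a number in that format from a list of episode returns. The helper name `summarize_returns`, the choice of population standard deviation, and the sample returns are all assumptions made for illustration, not this model's evaluation data.

```python
import statistics


def summarize_returns(episode_returns):
    """Format episode returns as 'mean +/- std' (hypothetical helper)."""
    mean = statistics.fmean(episode_returns)
    # Assumption: population standard deviation; the card does not state which is used.
    std = statistics.pstdev(episode_returns)
    return f"{mean:.2f} +/- {std:.2f}"


# Made-up returns from five evaluation episodes, for illustration only.
returns = [950.0, 1000.0, 1050.0, 980.0, 1020.0]
print(summarize_returns(returns))  # prints: 1000.00 +/- 34.06
```

Averaging over more episodes generally gives a more stable estimate of the mean, which is worth keeping in mind when comparing against the reported figure.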
## Usage (with CleanRL)
```python
import torch
import gymnasium as gym
from PPO_atari import Agent
env = gym.make("SpaceInvadersNoFrameskip-v4")
# Load the model
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
agent = Agent(env).to(device)
agent.load_state_dict(torch.load("model.pth", map_location=device))
agent.eval()
# Run evaluation
obs, _ = env.reset()
done = False
while not done:
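    # Add a batch dimension, then query the policy for an action
    action, _, _, _ = agent.get_action_and_value(torch.tensor(obs).unsqueeze(0).to(device))
    # Step the environment with the chosen action
    obs, reward, terminated, truncated, _ = env.step(action.cpu().numpy()[0])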