# DQN Agent playing SpaceInvadersNoFrameskip-v4
This is a trained model of a DQN agent playing SpaceInvadersNoFrameskip-v4, trained with the stable-baselines3 library as part of the Deep Reinforcement Learning Course.
The environment-registration errors that the course tutorial code raises with newer versions of SB3 have been resolved by registering the ALE environments via the `ale_py` module, as shown in the Usage section below.
## Evaluation Results
| Metric | Value |
|---|---|
| Mean Reward | 511.00 |
| Std Reward | 229.03 |
| Min Reward | 215.00 |
| Max Reward | 930.00 |
| Mean Episode Length | 775.40 |
| Score (mean - std) | 281.97 |
| Evaluation Episodes | 10 |
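The figures above can be reproduced with SB3's `evaluate_policy` helper over 10 episodes. The snippet below is a minimal sketch, assuming the model file `dqn-SpaceInvaders.zip` is in the working directory; the exact mean and standard deviation will vary slightly with the evaluation seed.

```python
from stable_baselines3 import DQN
from stable_baselines3.common.env_util import make_atari_env
from stable_baselines3.common.vec_env import VecFrameStack
from stable_baselines3.common.evaluation import evaluate_policy
import gymnasium as gym
import ale_py

gym.register_envs(ale_py)  # register the ALE (Atari) environments with Gymnasium

# Same preprocessing as in the Usage section: standard Atari wrappers + 4-frame stacking
env = make_atari_env("ALE/SpaceInvaders-v5", n_envs=1, seed=0)
env = VecFrameStack(env, n_stack=4)

model = DQN.load("dqn-SpaceInvaders")  # assumes the checkpoint is in the working directory

mean_reward, std_reward = evaluate_policy(model, env, n_eval_episodes=10, deterministic=True)
print(f"mean_reward={mean_reward:.2f} +/- {std_reward:.2f}")
```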
## Running Time Reference

- GPU: RTX 4060 (~35% utilization)
- Replay buffer size: 200,000 (WSL memory usage: ~13.5 GB)
- Total timesteps: 10M (training had not yet converged)
- Wall-clock time: ~23,800 s
## Usage
```python
from stable_baselines3 import DQN
from stable_baselines3.common.env_util import make_atari_env
from stable_baselines3.common.vec_env import VecFrameStack
import gymnasium as gym
import ale_py

# Register the ALE (Atari) environments with Gymnasium
gym.register_envs(ale_py)

# Build the Atari environment with the standard wrappers and stack 4 frames
env = make_atari_env("ALE/SpaceInvaders-v5", n_envs=1, seed=0)
env = VecFrameStack(env, n_stack=4)

model = DQN.load("dqn-SpaceInvaders")

obs = env.reset()
for i in range(1000):
    action, _states = model.predict(obs, deterministic=True)
    obs, rewards, dones, infos = env.step(action)
    if dones[0]:
        obs = env.reset()
```
## Training Configuration
- Algorithm: DQN (Deep Q-Network)
- Policy: CnnPolicy
- Total Timesteps: 10,000,000
- Learning Rate: 1e-4
- Buffer Size: 200,000
- Batch Size: 32
- Device: CUDA
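
For reference, here is a minimal training sketch using the hyperparameters listed above. It is an assumption of how the run was set up: only the values in the list are taken from this card, and every other setting (exploration schedule, target-network update interval, etc.) is left at the SB3 defaults.

```python
from stable_baselines3 import DQN
from stable_baselines3.common.env_util import make_atari_env
from stable_baselines3.common.vec_env import VecFrameStack
import gymnasium as gym
import ale_py

gym.register_envs(ale_py)

# Training environment with standard Atari preprocessing and 4-frame stacking
env = make_atari_env("ALE/SpaceInvaders-v5", n_envs=1, seed=0)
env = VecFrameStack(env, n_stack=4)

model = DQN(
    "CnnPolicy",
    env,
    learning_rate=1e-4,      # values below mirror the Training Configuration list
    buffer_size=200_000,
    batch_size=32,
    device="cuda",
    verbose=1,
)
model.learn(total_timesteps=10_000_000)
model.save("dqn-SpaceInvaders")
```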