DQN Agent on SpaceInvadersNoFrameskip-v4
This repository contains a trained Deep Q-Network (DQN) agent that plays the SpaceInvadersNoFrameskip-v4 environment, built with the Stable-Baselines3 library.
Model Card
Model Name: dqn-SpaceInvadersNoFrameskip-v4
Environment: SpaceInvadersNoFrameskip-v4
Algorithm: DQN (Deep Q-Network)
Performance Metric:
- Mean Reward: 565.50 ± 114.03
- Verification: Not yet independently verified
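The mean reward is the average episode return over a set of evaluation episodes, and the ± term is the standard deviation across those episodes. A minimal sketch of that arithmetic follows. The episode returns are hypothetical placeholder values, not this agent's evaluation data, and the population standard deviation is an assumption (it is what NumPy's default `std` computes). `summarize_returns` is a hypothetical helper name.

```python
from statistics import fmean, pstdev


def summarize_returns(episode_returns):
    """Return (mean, std) of per-episode returns, using the population std."""
    if not episode_returns:
        raise ValueError("need at least one episode return")
    return fmean(episode_returns), pstdev(episode_returns)


# Hypothetical returns from five evaluation episodes (illustrative values only).
returns = [400.0, 500.0, 600.0, 700.0, 800.0]
mean_reward, std_reward = summarize_returns(returns)
print(f"mean_reward = {mean_reward:.2f} +/- {std_reward:.2f}")
```

A large standard deviation relative to the mean, as in the reported figure, indicates that individual episodes vary considerably in score.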
Usage
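from huggingface_sb3 import load_from_hub
from stable_baselines3 import DQN
import gym

# Download the checkpoint; load_from_hub returns a local file path, not a model
checkpoint_path = load_from_hub(
    repo_id="KraTUZen/dqn-SpaceInvadersNoFrameskip-v4",
    filename="dqn.pkl",
)

# Load the trained DQN model (assumes the file is an archive written by DQN.save)
model = DQN.load(checkpoint_path)

# Initialize the environment (requires the Atari extras for gym)
env = gym.make("SpaceInvadersNoFrameskip-v4")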
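To turn a downloaded checkpoint into a score, the agent has to be run for several episodes. The sketch below shows one way this might be done with Stable-Baselines3's `evaluate_policy`. It rests on assumptions this card does not confirm: that `dqn.pkl` can be read by `DQN.load`, that training used the standard Atari preprocessing with four stacked frames, and that `stable-baselines3`, `huggingface_sb3`, and the Atari ROMs are installed. `evaluate_agent` is a hypothetical helper name.

```python
def evaluate_agent(repo_id, filename, env_id="SpaceInvadersNoFrameskip-v4", n_episodes=10):
    """Download a checkpoint and return (mean_reward, std_reward) over n_episodes."""
    # Imports live inside the function so this file loads without the RL stack.
    from huggingface_sb3 import load_from_hub
    from stable_baselines3 import DQN
    from stable_baselines3.common.env_util import make_atari_env
    from stable_baselines3.common.evaluation import evaluate_policy
    from stable_baselines3.common.vec_env import VecFrameStack

    # load_from_hub returns a local file path, not a model object.
    checkpoint_path = load_from_hub(repo_id=repo_id, filename=filename)

    # Assumption: training used AtariWrapper preprocessing and 4 stacked frames.
    env = VecFrameStack(make_atari_env(env_id, n_envs=1, seed=0), n_stack=4)

    # Assumption: the file is an archive that DQN.load can read.
    model = DQN.load(checkpoint_path, env=env)
    return evaluate_policy(model, env, n_eval_episodes=n_episodes, deterministic=True)
```

A call such as `evaluate_agent("KraTUZen/dqn-SpaceInvadersNoFrameskip-v4", "dqn.pkl")` would then return a mean and standard deviation comparable to the self-reported figure, provided the assumptions above hold.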
Notes
- The agent is trained using DQN, a value-based deep reinforcement learning algorithm.
- The environment is SpaceInvadersNoFrameskip-v4, a classic Atari game where the agent must shoot down alien invaders.
- The serialized Q-network is stored in dqn.pkl.
Repository Structure
- dqn.pkl: Trained Q-network weights
- README.md: Documentation and usage guide
Results
- The agent learns to maximize score by shooting invaders while avoiding losing lives.
- Demonstrates stable convergence using DQN, balancing exploration and exploitation.
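In DQN, the balance between exploration and exploitation is commonly handled with an epsilon-greedy policy whose random-action probability is annealed during training. A minimal sketch follows. The schedule values (start at 1.0, end at 0.05, annealed over the first 10% of training) are illustrative assumptions, not the hyperparameters used for this agent, and the function names are hypothetical.

```python
import random


def linear_epsilon(step, total_steps, eps_start=1.0, eps_end=0.05, fraction=0.1):
    """Linearly anneal epsilon from eps_start to eps_end over the first `fraction` of training."""
    progress = min(step / (fraction * total_steps), 1.0)
    return eps_start + progress * (eps_end - eps_start)


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy (argmax) action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


q_values = [0.1, 0.7, 0.3, 0.2, 0.5, 0.4]  # hypothetical Q-values for six actions
for step in (0, 50_000, 100_000, 500_000):
    eps = linear_epsilon(step, total_steps=1_000_000)
    print(step, round(eps, 3), epsilon_greedy(q_values, eps))
```

Early in training almost every action is random; once epsilon reaches its floor, the agent mostly follows its learned Q-values while still exploring occasionally.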
Environment Overview
- Observation Space: Pixel-based visual input (Atari frames)
- Action Space: Discrete, 6 actions (no-op, fire, move right, move left, move right and fire, move left and fire)
- Objective: Survive and maximize score by destroying invaders
- Reward: The game score gained at each step, earned by destroying invaders; losing a life carries no explicit negative reward but brings the episode closer to its end
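As an illustration of the discrete action space, the Arcade Learning Environment exposes Space Invaders through six action indices that combine the movements and firing described above. The sketch below names them with an enum. `SpaceInvadersAction` and `describe` are hypothetical helpers, and the index order is assumed to match the ALE action table; it can be confirmed with `env.unwrapped.get_action_meanings()`.

```python
from enum import IntEnum


class SpaceInvadersAction(IntEnum):
    """Assumed ALE action indices for Space Invaders (verify against the environment)."""
    NOOP = 0
    FIRE = 1
    RIGHT = 2
    LEFT = 3
    RIGHTFIRE = 4
    LEFTFIRE = 5


def describe(action_index):
    """Translate an integer chosen by the agent into a readable action name."""
    return SpaceInvadersAction(action_index).name


print([describe(i) for i in range(len(SpaceInvadersAction))])
```

The Q-network outputs one value per index, so its final layer has six units under this assumption.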
Learning Highlights
- Algorithm: DQN (Deep Q-Network)
- Update Rule: Q-learning with experience replay and target networks
- Strengths: Handles high-dimensional visual input effectively
- Limitations: Sensitive to hyperparameter tuning and replay buffer size
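The update rule listed above combines three pieces: a replay buffer that stores past transitions, a target network that supplies bootstrap values, and the Q-learning target `y = r + gamma * max_a' Q_target(s', a')` (with `y = r` at terminal states). The sketch below illustrates them in plain Python, with a lookup table standing in for the neural network. The class and function names, the discount factor, and the sample transitions are all hypothetical.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size FIFO store of (state, action, reward, next_state, done) transitions."""

    def __init__(self, capacity):
        self.storage = deque(maxlen=capacity)  # oldest transitions are evicted first

    def add(self, state, action, reward, next_state, done):
        self.storage.append((state, action, reward, next_state, done))

    def sample(self, batch_size, rng=random):
        # Uniform sampling breaks the correlation between consecutive frames.
        return rng.sample(list(self.storage), batch_size)


def td_target(reward, next_state, done, target_q, gamma=0.99):
    """Q-learning target: r + gamma * max_a' Q_target(s', a'), or just r at episode end."""
    if done:
        return reward
    return reward + gamma * max(target_q[next_state])


# A table stands in for the frozen target network: state -> list of Q-values.
target_q = {"s0": [0.0, 1.0], "s1": [2.0, 0.5]}

buffer = ReplayBuffer(capacity=3)
buffer.add("s0", 1, 10.0, "s1", False)
buffer.add("s1", 0, 5.0, "s0", True)

for state, action, reward, next_state, done in buffer.sample(batch_size=2):
    print(state, action, td_target(reward, next_state, done, target_q))
```

In a full DQN, the online network is regressed toward these targets, and the target network's weights are refreshed from the online network at intervals. The buffer capacity bounds how much past experience is available for sampling, which is one reason replay buffer size appears among the limitations.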
Evaluation results
- mean_reward on SpaceInvadersNoFrameskip-v4 (self-reported): 565.50 +/- 114.03