kuds/atari-pong-v4-ppo

Reinforcement Learning

stable-baselines3

PongNoFrameskip-v4

Eval Results (legacy)

kuds committed on Oct 16, 2025

Commit bfc9a77 · verified · 1 Parent(s): cc7d468

Update README.md

Files changed (1)

README.md +98 -3

@@ -1,3 +1,98 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+language:
+  - en
+library_name: stable-baselines3
+tags:
+  - reinforcement-learning
+  - PongNoFrameskip-v4
+model-index:
+  - name: PPO
+    results:
+      - task:
+          type: reinforcement-learning
+          name: reinforcement-learning
+        dataset:
+          name: PongNoFrameskip-v4
+          type: PongNoFrameskip-v4
+        metrics:
+          - type: mean_reward
+            value: 21.00 +/- 00.00
+            name: mean_reward
+            verified: false
+---
+# **PPO** Agent playing **PongNoFrameskip-v4**
+- [Github Repository](https://github.com/kuds/rl-atari-pong)
+- [Google Colab Notebook](https://colab.research.google.com/github/kuds/rl-atari-pong/blob/main/%5BAtari%20Pong%5D%20Single-Agent%20Reinforcement%20Learning%20PPO.ipynb)
+- [Finding Theta - Blog Post](https://www.findingtheta.com/blog/mastering-ataris-pong-with-reinforcement-learning-overcoming-sparse-rewards-and-optimizing-performance)
+With `best-model.zip` in your working directory, you can load the model using the following Python code:
+```python
+from stable_baselines3 import PPO
+from stable_baselines3.common.env_util import make_atari_env
+from stable_baselines3.common.vec_env import VecFrameStack, VecTransposeImage
+# Load the trained model
+model = PPO.load("best-model.zip")
+# Create the environment; render_mode="rgb_array" lets the VecEnv display frames (Stable-Baselines3 2.x)
+env = make_atari_env("PongNoFrameskip-v4", n_envs=1, env_kwargs={"render_mode": "rgb_array"})
+env = VecFrameStack(env, n_stack=4)
+env = VecTransposeImage(env)
+# Reset the environment (a VecEnv returns only the observations)
+obs = env.reset()
+# Enjoy the trained agent
+for _ in range(1000):
+    action, _states = model.predict(obs, deterministic=True)
+    # A VecEnv step returns four values and resets finished episodes automatically
+    obs, rewards, dones, infos = env.step(action)
+    env.render("human")
+env.close()
+```
+### Hugging Face Hub
+You can also load the model directly from the Hugging Face Hub. First, install the `huggingface_hub` library:
+```bash
+pip install huggingface_hub
+```
+Then, you can load the model from the hub using the following code:
+```python
+from huggingface_hub import hf_hub_download
+from stable_baselines3 import PPO
+from stable_baselines3.common.env_util import make_atari_env
+from stable_baselines3.common.vec_env import VecFrameStack, VecTransposeImage
+# Download the model from the Hub
+model_path = hf_hub_download(repo_id="kuds/atari-pong-v4-ppo", filename="best-model.zip")
+# Load the model
+model = PPO.load(model_path)
+# Create the environment; render_mode="rgb_array" lets the VecEnv display frames (Stable-Baselines3 2.x)
+env = make_atari_env("PongNoFrameskip-v4", n_envs=1, env_kwargs={"render_mode": "rgb_array"})
+env = VecFrameStack(env, n_stack=4)
+env = VecTransposeImage(env)
+# Enjoy the trained agent
+obs = env.reset()
+for _ in range(1000):
+    action, _states = model.predict(obs, deterministic=True)
+    # A VecEnv step returns four values and resets finished episodes automatically
+    obs, rewards, dones, infos = env.step(action)
+    env.render("human")
+env.close()
+```
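
The `mean_reward` entry in the front matter uses a "mean +/- std" format. In Pong the highest possible episode return is 21, so a standard deviation of zero implies that every evaluation episode ended 21-0. The sketch below shows one way such a string could be produced from per-episode returns. The helper name `format_mean_reward` and the ten-episode sample are hypothetical; the commit does not state how many episodes were evaluated or how the value was computed.

```python
from statistics import mean, pstdev


def format_mean_reward(episode_returns):
    """Format per-episode returns as 'mean +/- std' with two decimals.

    Hypothetical helper for illustration; not part of the repository.
    """
    if not episode_returns:
        raise ValueError("need at least one episode return")
    avg = mean(episode_returns)
    # Population standard deviation (assumed here; the commit does not say which was used)
    std = pstdev(episode_returns)
    return f"{avg:.2f} +/- {std:.2f}"


# Hypothetical evaluation: ten episodes, each won 21-0
perfect_returns = [21.0] * 10
print(format_mean_reward(perfect_returns))
```

With real evaluations, the list would hold the total reward of each episode played by the loaded agent.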