ch-bz committed on
Commit 066012a · verified · 1 Parent(s): 7c62a56

Update README.md

Files changed (1): README.md +49 -37
@@ -1,37 +1,49 @@
- ---
- library_name: stable-baselines3
- tags:
- - LunarLander-v2
- - deep-reinforcement-learning
- - reinforcement-learning
- - stable-baselines3
- model-index:
- - name: PPO
-   results:
-   - task:
-       type: reinforcement-learning
-       name: reinforcement-learning
-     dataset:
-       name: LunarLander-v2
-       type: LunarLander-v2
-     metrics:
-     - type: mean_reward
-       value: 265.37 +/- 25.58
-       name: mean_reward
-       verified: false
- ---
-
- # **PPO** Agent playing **LunarLander-v2**
- This is a trained model of a **PPO** agent playing **LunarLander-v2**
- using the [stable-baselines3 library](https://github.com/DLR-RM/stable-baselines3).
-
- ## Usage (with Stable-baselines3)
- TODO: Add your code
-
-
- ```python
- from stable_baselines3 import ...
- from huggingface_sb3 import load_from_hub
-
- ...
- ```
+ ---
+ library_name: stable-baselines3
+ tags:
+ - LunarLander-v2
+ - deep-reinforcement-learning
+ - reinforcement-learning
+ - stable-baselines3
+ model-index:
+ - name: PPO
+   results:
+   - task:
+       type: reinforcement-learning
+       name: reinforcement-learning
+     dataset:
+       name: LunarLander-v2
+       type: LunarLander-v2
+     metrics:
+     - type: mean_reward
+       value: 265.37 +/- 25.58
+       name: mean_reward
+       verified: false
+ ---
+
+ # **PPO** Agent playing **LunarLander-v2**
+ This is a trained model of a **PPO** agent playing **LunarLander-v2**
+ using the [stable-baselines3 library](https://github.com/DLR-RM/stable-baselines3).<br>
+ Created during the [Deep RL Course](https://huggingface.co/learn/deep-rl-course/unit0/introduction). Trained for 2,000,000 timesteps.
+
+ ## Usage (with Stable-baselines3)
+ ```python
+ import gymnasium as gym
+ from stable_baselines3 import PPO
+ from stable_baselines3.common.env_util import make_vec_env
+ from huggingface_sb3 import load_from_hub
+
+ # Download the checkpoint from the Hub and load the model
+ model_name = "LunarLander-v2"
+ model_path = load_from_hub(repo_id="ch-bz/ppo-" + model_name, filename=model_name + ".zip")
+ model = PPO.load(model_path)
+
+ # Demonstrate the model with 4 parallel instances
+ vec_env = make_vec_env(model_name, n_envs=4)
+
+ obs = vec_env.reset()
+ # Runs until interrupted; finished environments reset automatically
+ while True:
+     action, _states = model.predict(obs)
+     obs, rewards, dones, info = vec_env.step(action)
+     vec_env.render("human")
+ ```