Commit af401ee (verified), committed by MarcLinder
Parent: 0e95eb2

Update README.md

  1. README.md +36 -6
README.md CHANGED
@@ -16,7 +16,7 @@ model-index:
  type: LunarLander-v2
  metrics:
  - type: mean_reward
- value: 254.93 +/- 22.52
+ value: 249.81 +/- 23.73
  name: mean_reward
  verified: false
  ---
@@ -26,12 +26,42 @@ This is a trained model of a **PPO** agent playing **LunarLander-v2**
  using the [stable-baselines3 library](https://github.com/DLR-RM/stable-baselines3).

  ## Usage (with Stable-baselines3)
- TODO: Add your code
+ ```python
+ from stable_baselines3 import PPO
+ import gymnasium as gym
+ from stable_baselines3.common.env_util import make_vec_env
+ from stable_baselines3.common.evaluation import evaluate_policy

+ # Create the LunarLander environment (assumes gymnasium with the box2d extra is installed)
+ env = gym.make("LunarLander-v2")

- ```python
- from stable_baselines3 import ...
- from huggingface_sb3 import load_from_hub
+ # Vectorize the environment for parallel training
+ vec_env = make_vec_env('LunarLander-v2', n_envs=16)
+
+ # Instantiate the PPO agent
+ model = PPO("MlpPolicy", vec_env, verbose=1)
+
+ # Train the agent
+ model.learn(total_timesteps=int(2e5))

- ...
+ # Evaluate the trained agent on a separate environment
+ eval_env = gym.make("LunarLander-v2", render_mode="rgb_array")
+ mean_reward, std_reward = evaluate_policy(model, eval_env, n_eval_episodes=10, deterministic=True)
+ print(f"mean_reward={mean_reward:.2f} +/- {std_reward:.2f}")
+
+ # Save the trained model
+ model_name = "ppo-LunarLander-v2"
+ model.save(model_name)
+
+ # Package and upload the model to the Hub
+ from huggingface_sb3 import package_to_hub
+ package_to_hub(model=model,
+                model_name=model_name,
+                model_architecture="PPO",
+                env_id="LunarLander-v2",
+                eval_env=eval_env,
+                repo_id="your-username/ppo-LunarLander-v2",
+                commit_message="Upload PPO LunarLander-v2 trained agent")
  ```
+
+ Be sure to replace `your-username` with your Hugging Face username.
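
The `mean_reward` value in the model card metadata has the form `mean +/- std`. As a minimal sketch of how such a string can be produced from per-episode returns, the snippet below uses only the Python standard library. It assumes the population standard deviation (the same convention as NumPy's default `std`); the helper names `summarize_rewards` and `format_metric` and the sample returns are hypothetical and are not results from this model.

```python
from statistics import fmean, pstdev


def summarize_rewards(episode_rewards):
    """Return (mean, std) of per-episode returns, using the population std."""
    if not episode_rewards:
        raise ValueError("need at least one episode reward")
    return fmean(episode_rewards), pstdev(episode_rewards)


def format_metric(mean_reward, std_reward):
    """Format the pair as 'mean +/- std' with two decimals each."""
    return f"{mean_reward:.2f} +/- {std_reward:.2f}"


# Made-up returns for illustration only; not measured from this agent.
example_rewards = [240.0, 260.0, 250.0, 230.0, 270.0]
mean_reward, std_reward = summarize_rewards(example_rewards)
print(format_metric(mean_reward, std_reward))
```

Formatting both numbers with `:.2f` keeps the printed result consistent with the two-decimal value stored under `metrics` in the README front matter.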