Update README.md
README.md CHANGED
@@ -8,7 +8,18 @@ tags:
 - lunar-lander
 model-index:
 - name: lunarlander-ppo
-  results:
+  results:
+  - task:
+      type: reinforcement-learning
+      name: Reinforcement Learning
+    dataset:
+      name: LunarLander-v3
+      type: gymnasium
+    metrics:
+    - name: Mean Reward
+      type: reward
+      value: 290.40
+      verified: false
 license: apache-2.0
 ---
 
@@ -24,5 +35,23 @@ This is a PPO-trained agent for the **LunarLander-v3** environment using Stable-
 - Timesteps: 2.5M
 - Mean Reward: ~290
 
-
-
+## 🛠 Usage
+
+You can load and test the trained agent like this:
+
+```python
+import gymnasium as gym
+from stable_baselines3 import PPO
+
+# Load environment
+env = gym.make("LunarLander-v3", render_mode="human")
+
+# Load the pretrained model from Hugging Face Hub
+model = PPO.load("Vishand03/lunarlander-ppo")
+
+# Run a single episode
+obs, _ = env.reset()
+done = False
+while not done:
+    action, _ = model.predict(obs, deterministic=True)
+    obs, reward, terminated, truncated, _ =_
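
Note on the usage snippet added in this change: it stops mid-assignment on its last line, and `PPO.load` expects a path to a local checkpoint rather than a Hub repo id. The sketch below shows one way the same load-and-run flow could look, not the author's final code. It assumes the `huggingface_sb3` package is installed and that the checkpoint is stored as `lunarlander-ppo.zip`; that filename is hypothetical and should be checked against the repository's file list. The five-value unpacking is what Gymnasium's `env.step` returns, and `max_steps` is an illustrative safety cap that is not part of the original snippet.

```python
def run_episode(env, model, max_steps=1000):
    """Roll out one episode with a Gymnasium-style env; return the total reward."""
    obs, _ = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):  # max_steps is an illustrative cap (assumption)
        action, _ = model.predict(obs, deterministic=True)
        # Gymnasium's step() returns five values; the episode ends when
        # either terminated or truncated is True.
        obs, reward, terminated, truncated, _ = env.step(action)
        total_reward += float(reward)
        if terminated or truncated:
            break
    return total_reward


def load_agent(repo_id="Vishand03/lunarlander-ppo",
               filename="lunarlander-ppo.zip"):  # filename is hypothetical
    """Download the checkpoint from the Hub, then load it from the local path."""
    # Imported lazily so run_episode stays usable without these packages.
    from huggingface_sb3 import load_from_hub
    from stable_baselines3 import PPO

    checkpoint_path = load_from_hub(repo_id=repo_id, filename=filename)
    return PPO.load(checkpoint_path)


def main():
    import gymnasium as gym

    env = gym.make("LunarLander-v3", render_mode="human")
    try:
        model = load_agent()
        print(f"Episode return: {run_episode(env, model):.1f}")
    finally:
        env.close()
```

Calling `main()` opens the render window and plays a single episode. `run_episode` only relies on `reset`, `step`, and `predict`, so it can be exercised with stub objects before the real dependencies are installed.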