# PPO CartPole Agent
This is a PPO agent trained on the CartPole-v1 environment using Stable Baselines3.
## Performance
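The agent achieved a mean reward of 500.00 ± 0.00 over 10 evaluation episodes. Because CartPole-v1 gives a reward of 1 per step and truncates episodes at 500 steps, this is the maximum possible return, reached in every evaluation episode.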
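
The card does not say how the evaluation was run, so the following is only a minimal sketch of how a mean ± standard deviation over episodes could be computed. The helper name `evaluate_agent` is hypothetical; it assumes a model with an SB3-style `predict` method and an environment following the Gymnasium `reset`/`step` API. Stable Baselines3 also ships an `evaluate_policy` helper in `stable_baselines3.common.evaluation` that serves the same purpose.

```python
from statistics import mean, pstdev


def evaluate_agent(model, env, n_episodes=10):
    """Run n_episodes and return (mean, std) of the undiscounted episode returns.

    Hypothetical helper: `model` needs an SB3-style predict(), and `env`
    needs the Gymnasium reset()/step() API.
    """
    returns = []
    for _ in range(n_episodes):
        obs, _ = env.reset()
        total, done = 0.0, False
        while not done:
            # Deterministic actions, matching the usage example in this card
            action, _ = model.predict(obs, deterministic=True)
            obs, reward, terminated, truncated, _ = env.step(action)
            total += float(reward)
            done = terminated or truncated
        returns.append(total)
    # Population standard deviation over the collected episode returns
    return mean(returns), pstdev(returns)
```

With a loaded model and a `CartPole-v1` environment, `evaluate_agent(model, env, n_episodes=10)` would return the pair of numbers reported in the form `mean ± std`.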
## Training Details
- Algorithm: PPO
- Environment: CartPole-v1
- Training Steps: 25,000
- Framework: Stable Baselines3
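
The details above could correspond to a training run along the following lines. This is a sketch under stated assumptions, not the script that produced this model: the `MlpPolicy` policy, the default PPO hyperparameters, the save path, and the names `TRAIN_CONFIG` and `train` are all assumptions, since the card lists only the algorithm, environment, and step count.

```python
# Values taken from the card, except "policy", which is an assumption.
TRAIN_CONFIG = {
    "env_id": "CartPole-v1",
    "total_timesteps": 25_000,
    "policy": "MlpPolicy",  # assumed; not stated in the card
}


def train(config=TRAIN_CONFIG, save_path="cartpole-ppo"):
    """Train PPO with default hyperparameters and save the result (hypothetical helper)."""
    # Imported lazily so the config can be inspected without SB3 installed
    from stable_baselines3 import PPO

    # SB3 accepts a registered environment id and creates the env itself
    model = PPO(config["policy"], config["env_id"], verbose=0)
    model.learn(total_timesteps=config["total_timesteps"])
    model.save(save_path)  # the save path is an assumption
    return model
```

Calling `train()` would run the 25,000 training steps and write the saved model archive to the given path.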
## Usage
```python
from stable_baselines3 import PPO
import gymnasium as gym

# Load the model.
# PPO.load expects a local file path, so this assumes the saved model
# archive has already been downloaded to "drap/cartpole-ppo.zip".
model = PPO.load("drap/cartpole-ppo")

# Create environment
env = gym.make("CartPole-v1")

# Test the model for one episode
obs, _ = env.reset()
while True:
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, _ = env.step(action)
    if terminated or truncated:
        break

env.close()
```