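# PPO CartPole Agent

This is a PPO agent trained on the CartPole-v1 environment using Stable Baselines3.

## Performance

The agent achieved a mean reward of 500.00 ± 0.00 over 10 evaluation episodes. This is the maximum return in CartPole-v1, where each step yields a reward of 1 and episodes are truncated at 500 steps.

## Training Details

- Algorithm: PPO
- Environment: CartPole-v1
- Training Steps: 25,000
- Framework: Stable Baselines3

## Usage

```python
import gymnasium as gym
from stable_baselines3 import PPO

# Load the model. PPO.load reads a local checkpoint (the ".zip" extension
# may be omitted), so download the file first if it is not on disk.
model = PPO.load("drap/cartpole-ppo")

# Create the environment
env = gym.make("CartPole-v1")

# Run one episode with the deterministic policy
obs, _ = env.reset()
episode_return = 0.0
while True:
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, _ = env.step(action)
    episode_return += float(reward)
    if terminated or truncated:
        break

print(f"Episode return: {episode_return}")
env.close()
```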
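
The figure in the Performance section reads as a mean and standard deviation of per-episode returns. The exact evaluation script is not part of this card, so the following is only a minimal sketch of how such a number can be computed. It assumes a Gymnasium-style `reset`/`step` API and a model exposing `predict`; the helper name `evaluate` is hypothetical and not part of Stable Baselines3, which ships its own `evaluate_policy` in `stable_baselines3.common.evaluation` for the same purpose.

```python
from statistics import fmean, pstdev


def evaluate(model, env, n_episodes=10):
    """Run n_episodes and return (mean, std) of the undiscounted episode returns.

    Hypothetical helper: `model` needs a predict(obs, deterministic=...) method
    and `env` needs the Gymnasium reset()/step() interface.
    """
    returns = []
    for _ in range(n_episodes):
        obs, _ = env.reset()
        total, done = 0.0, False
        while not done:
            action, _ = model.predict(obs, deterministic=True)
            obs, reward, terminated, truncated, _ = env.step(action)
            total += float(reward)
            # An episode ends on failure (terminated) or on the time limit (truncated)
            done = terminated or truncated
        returns.append(total)
    # Population standard deviation over the collected returns
    return fmean(returns), pstdev(returns)
```

With the `model` and `env` objects from the Usage snippet, `mean, std = evaluate(model, env)` runs 10 episodes. A standard deviation of 0.00 at a mean of 500.00 means every episode ran to the 500-step time limit.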