---
tags:
- deep-reinforcement-learning
- reinforcement-learning
- stable-baselines3
- LunarLander-v2
- PPO
library_name: stable-baselines3
model_name: ppo
---

# 🚀 PPO Agent for LunarLander-v2

This is a trained PPO agent that learned to land a spacecraft on the moon!

## Model Description

- **Algorithm**: Proximal Policy Optimization (PPO)
- **Environment**: LunarLander-v2
- **Framework**: Stable-Baselines3
- **Training Steps**: 100,000 to 500,000

## Performance

- **Success Rate**: 90%+ successful landings
- **Average Reward**: 200+ (the successful landing threshold)
- **Best Performance**: 265+ reward

## Usage

```python
from stable_baselines3 import PPO
import gymnasium as gym

# Load the trained model
model = PPO.load("lunar_lander_ppo_model")

# Create the environment (LunarLander needs the Box2D extra: gymnasium[box2d])
env = gym.make('LunarLander-v2', render_mode='human')

# Run the agent for 1000 steps, resetting whenever an episode ends
obs, _ = env.reset()
for _ in range(1000):
    # deterministic=True picks the most likely action instead of sampling
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, _ = env.reset()

env.close()
```

## Training Details

The agent was trained using PPO with the following hyperparameters:

- Learning rate: 0.0003
- Batch size: 64
- Number of environments: 4
- Gamma: 0.999

## Results

The agent successfully learned to:

- Control spacecraft thrust
- Navigate to the landing pad
- Execute gentle landings
- Conserve fuel

Watch it land on the moon! 🌙
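
## Evaluation Sketch

The figures under Performance (success rate, average reward, best reward) can be recomputed from per-episode returns. The following is a minimal sketch, not the procedure used to produce the numbers above. It assumes an episode counts as a success when its total reward reaches 200, the threshold quoted in this card. `summarize_returns` and `evaluate_saved_model` are hypothetical helper names, and the default of 100 evaluation episodes is an arbitrary choice. The rollout uses `evaluate_policy` from Stable-Baselines3 with `return_episode_rewards=True`.

```python
from statistics import mean

# Assumption: an episode is a successful landing when its total reward
# reaches 200, the threshold quoted in the Performance section.
SUCCESS_THRESHOLD = 200.0


def summarize_returns(episode_returns, threshold=SUCCESS_THRESHOLD):
    """Summarize a list of per-episode total rewards (hypothetical helper)."""
    if not episode_returns:
        raise ValueError("episode_returns must not be empty")
    successes = sum(1 for r in episode_returns if r >= threshold)
    return {
        "episodes": len(episode_returns),
        "mean_reward": mean(episode_returns),
        "best_reward": max(episode_returns),
        "success_rate": successes / len(episode_returns),
    }


def evaluate_saved_model(model_path="lunar_lander_ppo_model", n_episodes=100):
    """Roll out the saved agent and summarize its returns (hypothetical helper).

    Imports are local so summarize_returns works without
    Stable-Baselines3 or Gymnasium installed.
    """
    import gymnasium as gym
    from stable_baselines3 import PPO
    from stable_baselines3.common.evaluation import evaluate_policy
    from stable_baselines3.common.monitor import Monitor

    model = PPO.load(model_path)
    # Monitor records true episode returns for evaluate_policy
    env = Monitor(gym.make("LunarLander-v2"))
    # return_episode_rewards=True returns per-episode lists, not mean/std
    episode_returns, _episode_lengths = evaluate_policy(
        model,
        env,
        n_eval_episodes=n_episodes,
        deterministic=True,
        return_episode_rewards=True,
    )
    env.close()
    return summarize_returns(episode_returns)
```

For example, `summarize_returns([250.0, 150.0, 210.0, 190.0])` reports a mean reward of 200.0, a best reward of 250.0, and a success rate of 0.5. Calling `evaluate_saved_model()` requires the saved model file and a working LunarLander installation.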