---
tags:
- deep-reinforcement-learning
- reinforcement-learning
- stable-baselines3
- LunarLander-v2
- PPO
library_name: stable-baselines3
model_name: ppo
---

# 🚀 PPO Agent for LunarLander-v2

This PPO agent was trained with Stable-Baselines3 to land a spacecraft on the moon in the LunarLander-v2 environment!

## Model Description

- **Algorithm**: Proximal Policy Optimization (PPO)
- **Environment**: LunarLander-v2
- **Framework**: Stable-Baselines3
- **Training Steps**: 100,000 to 500,000 timesteps

## Performance

- **Success Rate**: 90%+ of episodes end in a successful landing
- **Average Reward**: 200+ per episode (200 is the score at which LunarLander is conventionally considered solved)
- **Best Performance**: 265+ reward
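
The figures above can be checked with a short evaluation script. The following is a minimal sketch, not the procedure used to produce this card: the helper names `summarize_rewards` and `evaluate_agent`, the choice of 100 evaluation episodes, and the definition of "success" as an episode reward of at least 200 are all assumptions made for illustration. It relies on `evaluate_policy` from Stable-Baselines3 with `return_episode_rewards=True`.

```python
from statistics import mean

# Assumption: an episode counts as a "success" when its total reward reaches 200,
# the score at which LunarLander is conventionally considered solved.
SOLVED_THRESHOLD = 200.0


def summarize_rewards(episode_rewards, threshold=SOLVED_THRESHOLD):
    """Return mean reward, best reward, and the fraction of episodes at or above threshold."""
    successes = sum(1 for r in episode_rewards if r >= threshold)
    return {
        "mean_reward": mean(episode_rewards),
        "best_reward": max(episode_rewards),
        "success_rate": successes / len(episode_rewards),
    }


def evaluate_agent(model_path="lunar_lander_ppo_model", n_episodes=100):
    """Run the saved agent for n_episodes and summarize the per-episode rewards."""
    # Imported here so summarize_rewards can be used without these packages installed.
    import gymnasium as gym
    from stable_baselines3 import PPO
    from stable_baselines3.common.evaluation import evaluate_policy
    from stable_baselines3.common.monitor import Monitor

    model = PPO.load(model_path)
    # Monitor records true episode returns; no render_mode, so evaluation runs headless.
    env = Monitor(gym.make("LunarLander-v2"))
    episode_rewards, _episode_lengths = evaluate_policy(
        model,
        env,
        n_eval_episodes=n_episodes,
        deterministic=True,
        return_episode_rewards=True,
    )
    env.close()
    return summarize_rewards(episode_rewards)
```

Calling `evaluate_agent()` returns a dictionary with `mean_reward`, `best_reward`, and `success_rate`, which can be compared against the numbers listed in this section.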

## Usage

```python
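import gymnasium as gym
from stable_baselines3 import PPO

# Load the trained model (expects lunar_lander_ppo_model.zip in the working directory)
model = PPO.load("lunar_lander_ppo_model")

# Create the environment. LunarLander needs the Box2D extra: pip install "gymnasium[box2d]"
# Note: Gymnasium 1.0 and later replace 'LunarLander-v2' with 'LunarLander-v3'.
env = gym.make("LunarLander-v2", render_mode="human")

# Run the agent for 1000 steps, resetting whenever an episode ends
obs, _ = env.reset()
for _ in range(1000):
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, _ = env.reset()

env.close()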
```

## Training Details

The agent was trained using PPO with the following hyperparameters:
- Learning rate: 0.0003
- Batch size: 64
- Number of environments: 4
- Gamma: 0.999
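
For reference, the hyperparameters above map onto the Stable-Baselines3 API roughly as follows. This is a sketch under stated assumptions rather than the original training script: the `"MlpPolicy"` policy, the 500,000-timestep default (the upper end of the range given in this card), and the `train` helper itself are illustrative choices, and every setting not listed above is left at the library default.

```python
# Hyperparameters stated in this card; all other PPO settings use the SB3 defaults.
HYPERPARAMS = {
    "learning_rate": 3e-4,
    "batch_size": 64,
    "gamma": 0.999,
}
N_ENVS = 4  # number of parallel environments


def train(total_timesteps=500_000, save_path="lunar_lander_ppo_model"):
    """Train PPO on LunarLander-v2 and save the model (hypothetical helper)."""
    # Imported here so the configuration above can be inspected without SB3 installed.
    from stable_baselines3 import PPO
    from stable_baselines3.common.env_util import make_vec_env

    # Vectorized environment with N_ENVS copies of LunarLander-v2
    env = make_vec_env("LunarLander-v2", n_envs=N_ENVS)

    # Assumption: a standard MLP policy, since the observations are a flat vector
    model = PPO("MlpPolicy", env, verbose=1, **HYPERPARAMS)
    model.learn(total_timesteps=total_timesteps)
    model.save(save_path)
    env.close()
    return model
```

Running `train()` writes `lunar_lander_ppo_model.zip`, the file name that the Usage snippet loads.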

## Results

The agent successfully learned to:
- Control the spacecraft's thrust
- Navigate to the landing pad
- Execute gentle landings
- Conserve fuel

Watch it land on the moon! 🌙