---
tags:
- deep-reinforcement-learning
- reinforcement-learning
- stable-baselines3
- BipedalWalker-v3
- PPO
- SAC
library_name: stable-baselines3
model_name: ppo
---

# 🤖 PPO/SAC Agent for BipedalWalker-v3

This agent learned to walk on two legs from scratch, with no scripted gait to start from!

## Model Description

- **Algorithm**: PPO (Proximal Policy Optimization) or SAC (Soft Actor-Critic); the metadata and the usage example on this card refer to a PPO checkpoint
- **Environment**: BipedalWalker-v3
- **Framework**: Stable-Baselines3
- **Training Steps**: 500,000

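For orientation, here is a minimal sketch of how a run with this step budget could be set up in Stable-Baselines3. The card does not list hyperparameters, so the library defaults, the `MlpPolicy` network, the `train` helper, and the save-path pattern are all assumptions rather than the settings actually used.

```python
TOTAL_TIMESTEPS = 500_000  # step budget stated on this card
SUPPORTED_ALGOS = ("ppo", "sac")


def train(algo="ppo", total_timesteps=TOTAL_TIMESTEPS, save_path=None):
    """Train a PPO or SAC agent on BipedalWalker-v3 (hypothetical helper)."""
    if algo not in SUPPORTED_ALGOS:
        raise ValueError(f"algo must be one of {SUPPORTED_ALGOS}, got {algo!r}")

    # Imported lazily so the argument check above runs without the RL stack.
    import gymnasium as gym
    from stable_baselines3 import PPO, SAC

    algo_cls = {"ppo": PPO, "sac": SAC}[algo]
    env = gym.make("BipedalWalker-v3")

    # Assumption: default hyperparameters; the card does not specify any.
    model = algo_cls("MlpPolicy", env, verbose=1)
    model.learn(total_timesteps=total_timesteps)

    # Assumption: file name pattern modeled on the one in the usage example.
    model.save(save_path or f"bipedal_walker_{algo}_model")
    env.close()
    return model
```

Calling `train("ppo")` would write `bipedal_walker_ppo_model.zip`, the file name the usage example loads.
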
## Performance

- **Walking Success**: Consistent bipedal locomotion
- **Average Reward**: 200+ per episode (the walker stays upright and keeps moving forward)
- **Coordination**: Learned leg coordination and balance

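One way to check the average-reward figure yourself is to roll out the checkpoint for several episodes and summarize the returns. The sketch below uses Stable-Baselines3's `evaluate_policy`. The helper names, the 10-episode default, and the use of 200 as a pass threshold are illustrative assumptions, not the evaluation protocol behind the number above.

```python
from statistics import mean, stdev


def summarize_returns(episode_returns, threshold=200.0):
    """Return (mean, std, passed) for a list of per-episode returns."""
    avg = mean(episode_returns)
    spread = stdev(episode_returns) if len(episode_returns) > 1 else 0.0
    return avg, spread, avg >= threshold


def evaluate_checkpoint(path="bipedal_walker_ppo_model", n_episodes=10):
    """Roll out the saved PPO policy and summarize its episode returns."""
    # Imported lazily so summarize_returns works without the RL stack.
    import gymnasium as gym
    from stable_baselines3 import PPO
    from stable_baselines3.common.evaluation import evaluate_policy
    from stable_baselines3.common.monitor import Monitor

    env = Monitor(gym.make("BipedalWalker-v3"))
    model = PPO.load(path)
    returns, _lengths = evaluate_policy(
        model,
        env,
        n_eval_episodes=n_episodes,
        deterministic=True,
        return_episode_rewards=True,
    )
    env.close()
    return summarize_returns(returns)
```

`evaluate_checkpoint()` returns the mean return, its standard deviation, and whether the mean clears the threshold.
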
## Usage

```python
# Requires Gymnasium with Box2D support, e.g. pip install "gymnasium[box2d]"
from stable_baselines3 import PPO
import gymnasium as gym

# Load the trained model
model = PPO.load("bipedal_walker_ppo_model")

# Create environment
env = gym.make('BipedalWalker-v3', render_mode='human')

# Watch it walk!
obs, _ = env.reset()
for _ in range(2000):
    # Pick the policy's most likely action (no exploration noise)
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    # Start a new episode when the walker falls or the time limit is hit
    if terminated or truncated:
        obs, _ = env.reset()

env.close()
```

## Training Details

The agent learned to coordinate:
- 4 continuous joint controls (hip + knee for each leg)
- Balance and momentum management
- Forward locomotion
- Traversal of uneven terrain (stumps, pits, and ladders appear only in the hardcore variant of the environment)

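To make the four joint controls concrete, the sketch below pairs each element of an action vector with a joint and clips it to the environment's `[-1, 1]` action range. The joint names and the hip, knee, hip, knee ordering are assumptions based on the Gymnasium implementation; the card itself does not document them.

```python
# Assumed ordering of the 4 action components (not documented on this card).
JOINTS = ("hip_1", "knee_1", "hip_2", "knee_2")


def label_action(action):
    """Clip a 4-element action to [-1, 1] and pair each value with its joint."""
    if len(action) != len(JOINTS):
        raise ValueError(f"expected {len(JOINTS)} values, got {len(action)}")
    return {
        name: max(-1.0, min(1.0, float(value)))
        for name, value in zip(JOINTS, action)
    }
```

For example, `label_action(model.predict(obs, deterministic=True)[0])` shows which motor command the policy sends to each joint at a given step.
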
## What Makes This Impressive

- **24-dimensional state space** - Complex sensory input
- **Continuous control** - Smooth joint movements
- **Physics simulation** - Realistic walking dynamics
- **From scratch learning** - No pre-programmed walking patterns

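The 24 observation values can be grouped by sensor to see what the policy actually perceives. The layout below (4 hull readings, 5 per leg, 10 lidar rays) is an assumption taken from the Gymnasium implementation of the environment, and `split_observation` is a hypothetical helper, not part of any library.

```python
def split_observation(obs):
    """Group the 24 observation values by the sensor they come from."""
    if len(obs) != 24:
        raise ValueError(f"expected 24 values, got {len(obs)}")
    values = [float(v) for v in obs]
    return {
        "hull": values[0:4],     # angle, angular velocity, x velocity, y velocity
        "leg_1": values[4:9],    # hip angle/speed, knee angle/speed, ground contact
        "leg_2": values[9:14],   # the same five readings for the other leg
        "lidar": values[14:24],  # 10 rangefinder measurements
    }
```

Passing the `obs` array returned by `env.reset()` or `env.step()` gives a dictionary that is easier to inspect than the raw vector.
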
Amazing to watch a robot learn to walk! 🚶‍♂️