---
tags:
- deep-reinforcement-learning
- reinforcement-learning
- stable-baselines3
- BipedalWalker-v3
- PPO
- SAC
library_name: stable-baselines3
model_name: ppo
---

# 🤖 PPO/SAC Agent for BipedalWalker-v3

This is a trained agent that learned to walk on two legs from scratch!

## Model Description

- **Algorithm**: PPO (Proximal Policy Optimization) or SAC (Soft Actor-Critic); the usage example in this card loads a PPO checkpoint
- **Environment**: BipedalWalker-v3
- **Framework**: Stable-Baselines3
- **Training Steps**: 500,000

## Performance

- **Walking Success**: Consistent bipedal locomotion
- **Average Episode Reward**: 200+ (successful walking)
- **Coordination**: Learned proper leg coordination and balance

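A reward figure like the one above is normally an average of total episode returns. As a minimal sketch of how such a number could be measured, the helper below runs a few episodes and averages their returns. The names `mean_episode_return` and `predict` are hypothetical and not part of Stable-Baselines3; `env` is assumed to follow the Gymnasium `reset`/`step` API.

```python
from statistics import mean


def mean_episode_return(env, predict, n_episodes=10):
    """Average total reward over n_episodes.

    env     -- any object with Gymnasium-style reset() and step()
    predict -- hypothetical callable mapping an observation to an action
    """
    returns = []
    for _ in range(n_episodes):
        obs, _ = env.reset()
        total, done = 0.0, False
        while not done:
            obs, reward, terminated, truncated, _ = env.step(predict(obs))
            total += reward
            done = terminated or truncated
        returns.append(total)
    return mean(returns)
```

With a loaded model and environment, this could be called as `mean_episode_return(env, lambda obs: model.predict(obs, deterministic=True)[0])`. Stable-Baselines3 also provides `evaluate_policy` in `stable_baselines3.common.evaluation`, which returns the mean and standard deviation of episode rewards.
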
## Usage

```python
from stable_baselines3 import PPO
import gymnasium as gym

# Load the trained model
model = PPO.load("bipedal_walker_ppo_model")

# Create environment (needs the Box2D extra: pip install "gymnasium[box2d]")
env = gym.make('BipedalWalker-v3', render_mode='human')

# Watch it walk!
obs, _ = env.reset()
for _ in range(2000):
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    # Start a new episode when the walker falls or the time limit is hit
    if terminated or truncated:
        obs, _ = env.reset()

env.close()
```

## Training Details

The agent learned the following:
- 4 continuous joint controls (hip + knee for each leg)
- Balance and momentum management
- Forward locomotion
- Terrain navigation (the standard BipedalWalker-v3 terrain is only slightly uneven; larger obstacles appear in the hardcore variant)

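To make the four joint controls concrete, here is a small sketch that labels and clips an action vector. The ordering (hip and knee of the first leg, then hip and knee of the second) and the [-1, 1] range are assumptions based on the Gymnasium BipedalWalker documentation, not something this card states; `JOINT_NAMES` and `label_action` are hypothetical helper names.

```python
# Assumed ordering of the 4 action components (not stated in this card).
JOINT_NAMES = ("hip_1", "knee_1", "hip_2", "knee_2")


def label_action(action):
    """Clip each motor command to [-1, 1] and attach a joint name."""
    if len(action) != len(JOINT_NAMES):
        raise ValueError(f"expected {len(JOINT_NAMES)} values, got {len(action)}")
    return {
        name: max(-1.0, min(1.0, float(value)))
        for name, value in zip(JOINT_NAMES, action)
    }
```

Applied to the `action` returned by `model.predict`, `label_action(action)` shows which joint each number is meant to drive.
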
## What Makes This Impressive

- **24-dimensional state space** - Hull, joint, ground-contact, and lidar readings
- **Continuous control** - Smooth joint movements
- **Physics simulation** - Realistic walking dynamics
- **From scratch learning** - No pre-programmed walking patterns

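As a rough illustration of what the 24 observation values contain, the sketch below splits an observation into named groups. The layout (4 hull values, 5 values per leg, 10 lidar readings) is an assumption taken from the Gymnasium BipedalWalker documentation rather than from this card, and `split_observation` is a hypothetical helper.

```python
def split_observation(obs):
    """Split a 24-value observation into named groups (assumed layout)."""
    values = [float(v) for v in obs]
    if len(values) != 24:
        raise ValueError(f"expected 24 values, got {len(values)}")
    return {
        "hull": values[0:4],     # angle, angular velocity, x and y velocity
        "leg_1": values[4:9],    # hip angle/speed, knee angle/speed, ground contact
        "leg_2": values[9:14],   # the same five readings for the other leg
        "lidar": values[14:24],  # 10 rangefinder measurements
    }
```

Calling `split_observation(obs)` on an observation from `env.reset()` or `env.step()` makes it easier to see which part of the input the policy is reacting to.
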
Amazing to watch a robot learn to walk! 🚶‍♂️