---
tags:
- ppo
- reinforcement-learning
- swarm
- drone
- bittensor
license: mit
language:
- en
library_name: stable-baselines3
---
# Swarm PPO Drone

This repository contains a **Proximal Policy Optimization (PPO)** model trained for **swarm/drone control**.

The model was trained in **Gymnasium environments** with Stable-Baselines3 and exported for use in **Bittensor Subnet 124 (Swarm)**.

---
## Files

- `policy.pth` – Trained PPO policy weights (PyTorch).
- `ppo_policy.zip` – Stable-Baselines3 PPO saved model (reload with `PPO.load()`).
- `safe_policy_meta.json` – Metadata for policy compliance checks.
- `best/` – Best checkpoint saved during training.
- `eval_logs/` – Evaluation logs.
- `tb_logs/` – TensorBoard logs.

---
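If you want to inspect the raw weights in `policy.pth` directly with PyTorch, a minimal sketch is below. It assumes the file is a plain `state_dict` (which may not match this repo's exact format); a small stand-in network is serialized in memory so the snippet runs on its own.

```python
import io
import torch
import torch.nn as nn

# Stand-in for the trained policy so this example is self-contained.
# In practice you would call: torch.load("policy.pth", map_location="cpu")
net = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 2))
buffer = io.BytesIO()
torch.save(net.state_dict(), buffer)
buffer.seek(0)

# Load the state_dict and list parameter names and shapes.
state_dict = torch.load(buffer, map_location="cpu")
for name, tensor in state_dict.items():
    print(name, tuple(tensor.shape))
```

Listing parameter shapes this way is a quick sanity check that the weights match the observation/action dimensions of your environment.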
## Usage

### Load with Stable-Baselines3
```python
from stable_baselines3 import PPO
import gymnasium as gym

# Load the saved Stable-Baselines3 model
model = PPO.load("ppo_policy.zip")

# Example prediction (CartPole-v1 is used here purely for illustration)
env = gym.make("CartPole-v1")
obs, _ = env.reset()
action, _ = model.predict(obs, deterministic=True)
print("Predicted action:", action)
```