PPO Agent: Playing Huggy
This repository contains a trained PPO agent for the Huggy environment, built using the Unity ML-Agents Toolkit.
Training Details
- Trainer type: PPO
- Hyperparameters:
  - Batch size: 2048
  - Buffer size: 20480
  - Learning rate: 0.0003 (linear schedule)
  - Beta: 0.005 (linear schedule)
  - Epsilon: 0.2 (linear schedule)
  - Lambda: 0.95
  - Epochs: 3
- Network architecture:
  - Hidden units: 512
  - Layers: 3
- Normalization: Enabled
- Reward signal: Extrinsic (γ = 0.995, strength = 1.0)
- Max steps: 2,000,000
- Checkpoint interval: 200,000
- Exported models: ONNX checkpoints at multiple intervals (e.g., 199996, 399914, …, 2000042)
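For reference, the settings above correspond to an ML-Agents trainer configuration roughly like the following. This is a sketch reconstructed from the list above, not the actual config file shipped with this run; the behavior name `Huggy` and field names follow the standard ML-Agents YAML schema.

```yaml
behaviors:
  Huggy:
    trainer_type: ppo
    hyperparameters:
      batch_size: 2048
      buffer_size: 20480
      learning_rate: 0.0003
      learning_rate_schedule: linear
      beta: 0.005
      beta_schedule: linear
      epsilon: 0.2
      epsilon_schedule: linear
      lambd: 0.95
      num_epoch: 3
    network_settings:
      normalize: true
      hidden_units: 512
      num_layers: 3
    reward_signals:
      extrinsic:
        gamma: 0.995
        strength: 1.0
    max_steps: 2000000
    checkpoint_interval: 200000
```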
During training, the agent's mean reward improved steadily from ~1.8 at 50k steps to ~3.9 at 1.4M+ steps, stabilizing around ~3.7–3.9 with variance ~1.9–2.0.
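The "linear schedule" noted for the learning rate, beta, and epsilon means the value is annealed toward zero as training approaches `max_steps`. A minimal sketch of that decay (the toolkit's internals may differ slightly):

```python
def linear_schedule(initial_value: float, step: int, max_steps: int = 2_000_000) -> float:
    """Linearly anneal a hyperparameter from its initial value toward 0.

    Mirrors the effect of ML-Agents' `linear` schedule setting:
    the value shrinks proportionally to training progress and is
    clamped at 0 once max_steps is reached.
    """
    remaining = max(1.0 - step / max_steps, 0.0)
    return initial_value * remaining

# At the 1M-step midpoint of a 2M-step run, the 0.0003 learning rate is halved:
print(linear_schedule(0.0003, 1_000_000))  # 0.00015
```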
Usage (with ML-Agents)
Documentation: ML-Agents Toolkit Docs
Resume Training
mlagents-learn <your_config.yaml> --run-id=<run_id> --resume
Watch Your Agent Play
You can watch the agent directly in your browser:
- Go to Unity models on Hugging Face.
- Find this model: KraTUZen/HuggyTheStickFetcher.
- Select the exported .onnx file.
- Click "Watch the agent play".
Tutorials
- Short tutorial: Teach Huggy the Dog 🐶 to fetch the stick and play in-browser.
- Longer tutorial: Deep dive into ML-Agents training and deployment.