PPO Agent: Playing Huggy
This repository contains a trained PPO agent for the Huggy environment, built using the Unity ML-Agents Toolkit.
Training Details
- Trainer type: PPO
- Hyperparameters:
  - Batch size: 2048
  - Buffer size: 20480
  - Learning rate: 0.0003 (linear schedule)
  - Beta: 0.005 (linear schedule)
  - Epsilon: 0.2 (linear schedule)
  - Lambda: 0.95
  - Epochs: 3
- Network architecture:
  - Hidden units: 512
  - Layers: 3
- Normalization: Enabled
- Reward signal: Extrinsic (γ = 0.995, strength = 1.0)
- Max steps: 2,000,000
- Checkpoint interval: 200,000
- Exported models: ONNX checkpoints at multiple intervals (e.g., 199996, 399914, …, 2000042)
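For reference, the settings above correspond to an ML-Agents trainer configuration roughly like the following. This is a sketch reconstructed from the list above, not the actual config file shipped with this run; the behavior name `Huggy` and field names follow the standard ML-Agents YAML schema.

```yaml
behaviors:
  Huggy:
    trainer_type: ppo
    hyperparameters:
      batch_size: 2048
      buffer_size: 20480
      learning_rate: 0.0003
      learning_rate_schedule: linear
      beta: 0.005
      beta_schedule: linear
      epsilon: 0.2
      epsilon_schedule: linear
      lambd: 0.95
      num_epoch: 3
    network_settings:
      normalize: true
      hidden_units: 512
      num_layers: 3
    reward_signals:
      extrinsic:
        gamma: 0.995
        strength: 1.0
    max_steps: 2000000
    checkpoint_interval: 200000
```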
During training, the agent's mean reward improved steadily from ~1.8 at 50k steps to ~3.9 at 1.4M+ steps, stabilizing around ~3.7–3.9 with variance ~1.9–2.0.
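The "linear schedule" noted for the learning rate, beta, and epsilon means the value is annealed toward zero as training approaches `max_steps`. A minimal sketch of that decay (the toolkit's internals may differ slightly):

```python
def linear_schedule(initial_value: float, step: int, max_steps: int = 2_000_000) -> float:
    """Linearly anneal a hyperparameter from its initial value toward 0.

    Mirrors the effect of ML-Agents' `linear` schedule setting:
    the value shrinks proportionally to training progress and is
    clamped at 0 once max_steps is reached.
    """
    remaining = max(1.0 - step / max_steps, 0.0)
    return initial_value * remaining

# At the 1M-step midpoint of a 2M-step run, the 0.0003 learning rate is halved:
print(linear_schedule(0.0003, 1_000_000))  # 0.00015
```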
Usage (with ML-Agents)
Documentation: ML-Agents Toolkit Docs
Resume Training
mlagents-learn <your_config.yaml> --run-id=<run_id> --resume
Watch Your Agent Play
You can watch the agent directly in your browser:
- Go to Unity models on Hugging Face.
- Find this model: KraTUZen/HuggyTheStickFetcher.
- Select the exported .onnx file.
- Click "Watch the agent play".
Tutorials
- Short tutorial: Teach Huggy the Dog 🐶 to fetch the stick and play in-browser.
- Longer tutorial: Deep dive into ML-Agents training and deployment.