SnowballTarget PPO Model
Trained with ML-Agents (release 21) on the SnowballTarget environment.
- Mean Reward: ~28–29 (at 1M steps)
- Training Steps: 1,000,000
- Training Time: ~26 minutes
- Environments: 4 (CPU)
- Configuration: See SnowballTarget.yaml
Repository: AMZ2004/SnowballTarget-2025-08-03
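As a rough sanity check on the figures above, the step count and training time imply an approximate throughput. This is a minimal sketch, not a measurement: the constants are copied from the list, the training time is only approximate, and the per-environment figure assumes the 1,000,000 steps are counted across all 4 environments and split evenly between them.

```python
# Figures copied from the list above; the training time is approximate (~26 min).
TOTAL_STEPS = 1_000_000
TRAINING_MINUTES = 26
NUM_ENVS = 4


def steps_per_second(total_steps: int, minutes: float) -> float:
    """Average number of training steps processed per wall-clock second."""
    return total_steps / (minutes * 60)


overall = steps_per_second(TOTAL_STEPS, TRAINING_MINUTES)

# Assumption: steps are summed over all environments and shared evenly.
per_env = overall / NUM_ENVS

print(f"overall: {overall:.0f} steps/s, per environment: {per_env:.0f} steps/s")
# prints "overall: 641 steps/s, per environment: 160 steps/s"
```

Comparing this number against your own run is a quick way to tell whether a reproduction on different hardware is in the same ballpark.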