Francesco-A/ppo-SnowballTarget-v1

@@ -5,6 +5,7 @@ tags:
 - deep-reinforcement-learning
 - reinforcement-learning
 - ML-Agents-SnowballTarget
 ---
   # **ppo** Agent playing **SnowballTarget**
@@ -14,22 +15,72 @@ tags:
   ## Usage (with ML-Agents)
   The Documentation: https://unity-technologies.github.io/ml-agents/ML-Agents-Toolkit-Documentation/
-  We wrote a complete tutorial to learn to train your first agent using ML-Agents and publish it to the Hub:
-  - A *short tutorial* where you teach Huggy the Dog 🐶 to fetch the stick and then play with him directly in your
-  browser: https://huggingface.co/learn/deep-rl-course/unitbonus1/introduction
-  - A *longer tutorial* to understand how works ML-Agents:
-  https://huggingface.co/learn/deep-rl-course/unit5/introduction
-  ### Resume the training
-  ```bash
-  mlagents-learn <your_configuration_file_path.yaml> --run-id=<run_id> --resume
-  ```
-  ### Watch your Agent play
-  You can watch your agent **playing directly in your browser**
-  1. If the environment is part of ML-Agents official environments, go to https://huggingface.co/unity
-  2. Step 1: Find your model_id: Francesco-A/ppo-SnowballTarget-v1
-  3. Step 2: Select your *.nn /*.onnx file
-  4. Click on Watch the agent play 👀

 - deep-reinforcement-learning
 - reinforcement-learning
 - ML-Agents-SnowballTarget
+license: apache-2.0
 ---
   # **ppo** Agent playing **SnowballTarget**
   ## Usage (with ML-Agents)
   The Documentation: https://unity-technologies.github.io/ml-agents/ML-Agents-Toolkit-Documentation/
+  ## Training hyperparameters
+```yaml
+behaviors:
+  SnowballTarget:
+    trainer_type: ppo
+    summary_freq: 10000
+    keep_checkpoints: 10
+    checkpoint_interval: 55000
+    max_steps: 250000
+    time_horizon: 64
+    threaded: true
+    hyperparameters:
+      learning_rate: 0.0003
+      learning_rate_schedule: linear
+      batch_size: 128
+      buffer_size: 2048
+      beta: 0.005
+      epsilon: 0.2
+      lambd: 0.95
+      num_epoch: 3
+    network_settings:
+      normalize: false
+      hidden_units: 256
+      num_layers: 2
+      vis_encode_type: simple
+    reward_signals:
+      extrinsic:
+        gamma: 0.99
+        strength: 1.0
+```
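
As a rough illustration of what the configuration above implies, the sketch below derives a few quantities from it. It assumes the usual ML-Agents reading of these keys: `buffer_size` experiences are collected before each policy update, each update makes `num_epoch` passes over the buffer in minibatches of `batch_size`, and the `linear` schedule decays the learning rate toward zero at `max_steps`. The helper `linear_lr` and the variable names are hypothetical and not part of ML-Agents, and the exact checkpoint steps in a real run may differ slightly.

```python
# Values copied from the training configuration above.
buffer_size = 2048
batch_size = 128
num_epoch = 3
learning_rate = 0.0003
max_steps = 250000
checkpoint_interval = 55000

# Each pass over the buffer is split into minibatches of batch_size.
minibatches_per_epoch = buffer_size // batch_size

# Gradient steps performed every time the buffer fills up.
updates_per_buffer = minibatches_per_epoch * num_epoch


def linear_lr(step):
    """Hypothetical helper: assumed linear decay from learning_rate to ~0 at max_steps."""
    remaining = max(0.0, 1.0 - step / max_steps)
    return learning_rate * remaining


# Approximate steps at which intermediate checkpoints would be written.
checkpoint_steps = list(range(checkpoint_interval, max_steps + 1, checkpoint_interval))

print("minibatches per epoch:", minibatches_per_epoch)
print("gradient updates per buffer:", updates_per_buffer)
print("approximate checkpoint steps:", checkpoint_steps)
print("assumed learning rate at the halfway point:", linear_lr(max_steps // 2))
```

Under these assumptions, four intermediate checkpoints fall within the 250000-step budget, which is well below the `keep_checkpoints: 10` limit, so none of them would be discarded.
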
+## Training details
+| Step    | Time Elapsed | Mean Reward | Std of Reward | Status    |
+|---------|--------------|-------------|---------------|-----------|
+| 10000   | 29.079 s     | 3.636       | 1.746         | Training  |
+| 20000   | 55.042 s     | 7.164       | 2.661         | Training  |
+| 30000   | 77.884 s     | 9.818       | 2.534         | Training  |
+| 40000   | 103.229 s    | 11.509      | 2.263         | Training  |
+| 50000   | 127.046 s    | 14.659      | 2.495         | Training  |
+| 60000   | 150.811 s    | 15.655      | 2.414         | Training  |
+| 70000   | 174.292 s    | 16.955      | 2.540         | Training  |
+| 80000   | 198.938 s    | 18.091      | 2.481         | Training  |
+| 90000   | 221.915 s    | 19.182      | 3.143         | Training  |
+| 100000  | 246.203 s    | 21.182      | 2.724         | Training  |
+| 110000  | 271.024 s    | 22.463      | 2.250         | Training  |
+| 120000  | 292.551 s    | 24.044      | 2.190         | Training  |
+| 130000  | 317.539 s    | 24.291      | 2.103         | Training  |
+| 140000  | 340.057 s    | 24.455      | 4.423         | Training  |
+| 150000  | 366.645 s    | 25.236      | 2.358         | Training  |
+| 160000  | 390.192 s    | 25.000      | 1.895         | Training  |
+| 170000  | 414.326 s    | 25.273      | 2.482         | Training  |
+| 180000  | 438.103 s    | 25.750      | 1.798         | Training  |
+| 190000  | 462.837 s    | 25.673      | 1.888         | Training  |
+| 200000  | 485.258 s    | 25.295      | 2.380         | Training  |
+| 210000  | 509.542 s    | 25.855      | 2.066         | Training  |
+| 220000  | 535.202 s    | 26.111      | 1.931         | Training  |
+| 230000  | 556.965 s    | 25.644      | 2.252         | Training  |
+| 240000  | 582.135 s    | 26.018      | 2.673         | Training  |
+| 250000  | 604.248 s    | 26.091      | 1.917         | Training  |
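
The log above can be summarized with a little arithmetic. The sketch below copies a subset of rows from the table and computes the overall training throughput and the reward gain per 10,000 steps between the selected rows. The variable names are illustrative only, and the row selection is an arbitrary sample of the table, not additional data.

```python
# (step, elapsed seconds, mean reward) rows copied from the training table above.
log = [
    (10000, 29.079, 3.636),
    (50000, 127.046, 14.659),
    (100000, 246.203, 21.182),
    (150000, 366.645, 25.236),
    (200000, 485.258, 25.295),
    (250000, 604.248, 26.091),
]

# Overall throughput: total steps divided by total wall-clock time.
final_step, final_time, final_reward = log[-1]
steps_per_second = final_step / final_time

# Mean-reward gain per 10,000 steps between consecutive sampled rows.
gains = []
for (s0, _, r0), (s1, _, r1) in zip(log, log[1:]):
    gain = (r1 - r0) / ((s1 - s0) / 10000)
    gains.append(gain)
    print(f"{s0:>6} -> {s1:>6}: {gain:+.3f} mean reward per 10k steps")

print(f"throughput: {steps_per_second:.1f} steps/s")
```

Read this way, most of the improvement happens in the first 150,000 steps; after that the mean reward stays between roughly 25 and 26.
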
+  ### Watch the Agent play
+  You can watch the agent **playing directly in your browser**:
+  1. Go to https://huggingface.co/spaces/ThomasSimonini/ML-Agents-SnowballTarget
+  2. Find the model_id: Francesco-A/ppo-SnowballTarget-v1
+  3. Select the *.nn / *.onnx file
+  4. Click on Watch the agent play