Commit 9ca9abb by Francesco-A · 1 Parent(s): 5e88248

Update README.md

  1. README.md +67 -16
README.md CHANGED
@@ -5,6 +5,7 @@ tags:
  - deep-reinforcement-learning
  - reinforcement-learning
  - ML-Agents-SnowballTarget
  ---

  # **ppo** Agent playing **SnowballTarget**
@@ -14,22 +15,72 @@ tags:
  ## Usage (with ML-Agents)
  The Documentation: https://unity-technologies.github.io/ml-agents/ML-Agents-Toolkit-Documentation/

- We wrote a complete tutorial to learn to train your first agent using ML-Agents and publish it to the Hub:
- - A *short tutorial* where you teach Huggy the Dog 🐶 to fetch the stick and then play with him directly in your
- browser: https://huggingface.co/learn/deep-rl-course/unitbonus1/introduction
- - A *longer tutorial* to understand how works ML-Agents:
- https://huggingface.co/learn/deep-rl-course/unit5/introduction

- ### Resume the training
- ```bash
- mlagents-learn <your_configuration_file_path.yaml> --run-id=<run_id> --resume
- ```

- ### Watch your Agent play
- You can watch your agent **playing directly in your browser**

- 1. If the environment is part of ML-Agents official environments, go to https://huggingface.co/unity
- 2. Step 1: Find your model_id: Francesco-A/ppo-SnowballTarget-v1
- 3. Step 2: Select your *.nn /*.onnx file
- 4. Click on Watch the agent play 👀
-

  - deep-reinforcement-learning
  - reinforcement-learning
  - ML-Agents-SnowballTarget
+ license: apache-2.0
  ---

  # **ppo** Agent playing **SnowballTarget**
 
  ## Usage (with ML-Agents)
  The Documentation: https://unity-technologies.github.io/ml-agents/ML-Agents-Toolkit-Documentation/

+ ## Training hyperparameters
+
+ ```python
+ behaviors:
+   SnowballTarget:
+     trainer_type: ppo
+     summary_freq: 10000
+     keep_checkpoints: 10
+     checkpoint_interval: 55000
+     max_steps: 250000
+     time_horizon: 64
+     threaded: true
+     hyperparameters:
+       learning_rate: 0.0003
+       learning_rate_schedule: linear
+       batch_size: 128
+       buffer_size: 2048
+       beta: 0.005
+       epsilon: 0.2
+       lambd: 0.95
+       num_epoch: 3
+     network_settings:
+       normalize: false
+       hidden_units: 256
+       num_layers: 2
+       vis_encode_type: simple
+     reward_signals:
+       extrinsic:
+         gamma: 0.99
+         strength: 1.0
+ ```
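
The hyperparameters in this hunk imply a rough update schedule. The following is a minimal sketch, assuming the usual PPO reading of these fields (the policy is updated once `buffer_size` experiences are collected, with `num_epoch` passes over the buffer in minibatches of `batch_size`). The helper `derived_schedule` is hypothetical and not part of ML-Agents; the numbers are estimates, not trainer output.

```python
# Values copied from the configuration in this diff. The derived numbers are
# back-of-the-envelope estimates, not output from ML-Agents itself.
config = {
    "batch_size": 128,
    "buffer_size": 2048,
    "num_epoch": 3,
    "max_steps": 250_000,
    "summary_freq": 10_000,
    "checkpoint_interval": 55_000,
}


def derived_schedule(cfg):
    """Hypothetical helper: estimate how often things happen during the run."""
    minibatches_per_epoch = cfg["buffer_size"] // cfg["batch_size"]
    return {
        # Gradient steps taken each time the buffer fills up.
        "gradient_steps_per_update": minibatches_per_epoch * cfg["num_epoch"],
        # Approximate count of policy updates over the whole run.
        "approx_policy_updates": cfg["max_steps"] // cfg["buffer_size"],
        # Number of summary rows expected in the training log.
        "summaries": cfg["max_steps"] // cfg["summary_freq"],
        # Steps that are multiples of checkpoint_interval within max_steps.
        "checkpoint_steps": list(
            range(
                cfg["checkpoint_interval"],
                cfg["max_steps"] + 1,
                cfg["checkpoint_interval"],
            )
        ),
    }


schedule = derived_schedule(config)
for name, value in schedule.items():
    print(f"{name}: {value}")
```

Under these assumptions, `summary_freq: 10000` over `max_steps: 250000` gives 25 summaries, which agrees with the 25 rows in the training table added by this commit.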

+ ## Training details

+ | Step | Time Elapsed | Mean Reward | Std of Reward | Status |
+ |---------|--------------|-------------|---------------|-----------|
+ | 10000 | 29.079 s | 3.636 | 1.746 | Training |
+ | 20000 | 55.042 s | 7.164 | 2.661 | Training |
+ | 30000 | 77.884 s | 9.818 | 2.534 | Training |
+ | 40000 | 103.229 s | 11.509 | 2.263 | Training |
+ | 50000 | 127.046 s | 14.659 | 2.495 | Training |
+ | 60000 | 150.811 s | 15.655 | 2.414 | Training |
+ | 70000 | 174.292 s | 16.955 | 2.540 | Training |
+ | 80000 | 198.938 s | 18.091 | 2.481 | Training |
+ | 90000 | 221.915 s | 19.182 | 3.143 | Training |
+ | 100000 | 246.203 s | 21.182 | 2.724 | Training |
+ | 110000 | 271.024 s | 22.463 | 2.250 | Training |
+ | 120000 | 292.551 s | 24.044 | 2.190 | Training |
+ | 130000 | 317.539 s | 24.291 | 2.103 | Training |
+ | 140000 | 340.057 s | 24.455 | 4.423 | Training |
+ | 150000 | 366.645 s | 25.236 | 2.358 | Training |
+ | 160000 | 390.192 s | 25.000 | 1.895 | Training |
+ | 170000 | 414.326 s | 25.273 | 2.482 | Training |
+ | 180000 | 438.103 s | 25.750 | 1.798 | Training |
+ | 190000 | 462.837 s | 25.673 | 1.888 | Training |
+ | 200000 | 485.258 s | 25.295 | 2.380 | Training |
+ | 210000 | 509.542 s | 25.855 | 2.066 | Training |
+ | 220000 | 535.202 s | 26.111 | 1.931 | Training |
+ | 230000 | 556.965 s | 25.644 | 2.252 | Training |
+ | 240000 | 582.135 s | 26.018 | 2.673 | Training |
+ | 250000 | 604.248 s | 26.091 | 1.917 | Training |
 
80
+ ### Watch the Agent play
81
+ You can watch the agent **playing directly in your browser**
82
+
83
+ 1. Go to https://huggingface.co/spaces/ThomasSimonini/ML-Agents-SnowballTarget
84
+ 2. Step 1: Find the model_id: Francesco-A/ppo-SnowballTarget-v1
85
+ 3. Step 2: Select the *.nn /*.onnx file
86
+ 4. Click on Watch the agent play