RyanAA committed on
Commit 978f2cb · verified · 1 Parent(s): 1f5948d

Updated README.md

Files changed (1)
  1. README.md +33 -23
README.md CHANGED
@@ -1,32 +1,42 @@
- %%writefile README.md
- # PPO SnowballTarget Agent
+ ---
+ tags:
+ - reinforcement-learning
+ - ml-agents
+ - ppo
+ - unity
+ - SnowballTarget
+ license: mit
+ ---

- This model was trained using Proximal Policy Optimization (PPO) with Unity ML-Agents as part of the Hugging Face Deep Reinforcement Learning Course.
+ # PPO SnowballTarget

- ## Environment
- - Unity ML-Agents
- - SnowballTarget environment
+ This is a trained PPO agent playing SnowballTarget using Unity ML-Agents.

- ## Training Details
- - Algorithm: PPO
- - Total training steps: 200,000
- - Final mean reward: ~23.2
+ ## Environment
+ SnowballTarget

- ## Results
- The agent learned to consistently hit targets in the SnowballTarget environment and achieved stable rewards during training.
+ ## Algorithm
+ PPO (Proximal Policy Optimization)

- Final training logs:
- - Step 160000 → Mean Reward: 22.84
- - Step 170000 → Mean Reward: 22.85
- - Step 180000 → Mean Reward: 23.00
- - Step 190000 → Mean Reward: 23.46
- - Step 200000 → Mean Reward: 23.21
+ ## Training Results

- ## Files
- - `SnowballTarget.onnx` — trained Unity ML-Agents policy network
+ Final mean reward: ~23.2 after 200k training steps.

  ## Usage
- This model can be loaded into Unity ML-Agents for inference and evaluation.

- ## Author
- Ryan Aparicio
+ You can watch the agent play directly in your browser:
+
+ 1. Go to:
+ https://huggingface.co/spaces/ThomasSimonini/ML-Agents-SnowballTarget
+
+ 2. In the model selector, enter:
+
+ RyanAA/ppo-SnowballTarget
+
+ 3. Select `SnowballTarget.onnx`
+
+ 4. Click "Watch the agent play"
+
+ ## Files
+
+ - `SnowballTarget.onnx` — trained policy network
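
Review note: the new Usage section covers only the browser demo, so readers who want the model file itself may need a direct link. A minimal sketch, assuming the file lives on the `main` branch of `RyanAA/ppo-SnowballTarget` (the branch name is an assumption; the diff does not state it) and using the Hub's standard `resolve` URL layout. The helper name `model_file_url` is hypothetical and not part of any library.

```python
from urllib.parse import quote

HUB = "https://huggingface.co"


def model_file_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build the direct-download ("resolve") URL for a file in a Hub model repo.

    `revision` defaults to "main", which is an assumption about this repo.
    """
    # Keep the "/" between owner and repo name; escape everything else.
    repo_part = quote(repo_id, safe="/")
    # A revision may contain "/", which must be escaped inside the URL path.
    rev_part = quote(revision, safe="")
    return f"{HUB}/{repo_part}/resolve/{rev_part}/{quote(filename)}"


# Repo id and filename are taken from the Usage and Files sections of the README.
url = model_file_url("RyanAA/ppo-SnowballTarget", "SnowballTarget.onnx")
print(url)
```

The sketch only builds the URL and makes no network request, so it does not confirm that the file exists at that revision.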