Commit 1f5948d by RyanAA (verified) · 1 Parent(s): 7e11a2b

created readme

Files changed (1)
  1. README.md +32 -0
README.md ADDED
@@ -0,0 +1,32 @@
+ %%writefile README.md
+ # PPO SnowballTarget Agent
+
+ This model was trained using Proximal Policy Optimization (PPO) with Unity ML-Agents as part of the Hugging Face Deep Reinforcement Learning Course.
+
+ ## Environment
+ - Unity ML-Agents
+ - SnowballTarget environment
+
+ ## Training Details
+ - Algorithm: PPO
+ - Total training steps: 200,000
+ - Final mean reward: ~23.2
+
+ ## Results
+ The agent learned to consistently hit targets in the SnowballTarget environment and achieved stable rewards during training.
+
+ Final training logs:
+ - Step 160000 → Mean Reward: 22.84
+ - Step 170000 → Mean Reward: 22.85
+ - Step 180000 → Mean Reward: 23.00
+ - Step 190000 → Mean Reward: 23.46
+ - Step 200000 → Mean Reward: 23.21
+
+ ## Files
+ - `SnowballTarget.onnx` — trained Unity ML-Agents policy network
+
+ ## Usage
+ This model can be loaded into Unity ML-Agents for inference and evaluation.
+
+ ## Author
+ Ryan Aparicio
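
Review note: the "Final training logs" entries in the added README can be checked against the reported "Final mean reward: ~23.2" with a few lines of Python. The sketch below is not part of the commit. It assumes the five log lines are copied verbatim from the README, and the names `LOG_LINES`, `LOG_PATTERN`, and `parse_log` are hypothetical helpers introduced only for illustration.

```python
import re
from statistics import mean

# The five log lines from the README's Results section, copied verbatim.
LOG_LINES = [
    "Step 160000 → Mean Reward: 22.84",
    "Step 170000 → Mean Reward: 22.85",
    "Step 180000 → Mean Reward: 23.00",
    "Step 190000 → Mean Reward: 23.46",
    "Step 200000 → Mean Reward: 23.21",
]

# One log line: a step count followed by the mean reward at that step.
LOG_PATTERN = re.compile(r"Step (\d+) → Mean Reward: ([\d.]+)")


def parse_log(lines):
    """Return (step, mean_reward) pairs parsed from README-style log lines."""
    records = []
    for line in lines:
        match = LOG_PATTERN.search(line)
        if match is None:
            raise ValueError(f"unrecognized log line: {line!r}")
        records.append((int(match.group(1)), float(match.group(2))))
    return records


records = parse_log(LOG_LINES)
rewards = [reward for _, reward in records]

final_step, final_reward = records[-1]
best_step, best_reward = max(records, key=lambda rec: rec[1])
spread = max(rewards) - min(rewards)

print(f"final checkpoint: step {final_step}, mean reward {final_reward:.2f}")
print(f"best checkpoint:  step {best_step}, mean reward {best_reward:.2f}")
print(f"average over the {len(rewards)} logged checkpoints: {mean(rewards):.3f}")
print(f"spread (max - min): {spread:.2f}")
```

On these numbers, the "~23.2" figure corresponds to the last checkpoint (23.21), the highest logged value is 23.46 at step 190000, and the average of the five logged checkpoints is about 23.07. The values stay within roughly 0.6 of each other, which is consistent with the README's description of stable rewards over the final 40,000 steps.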