sam522 committed on
Commit
1349781
·
verified ·
1 Parent(s): bc8ff46

Create README.md

  1. README.md +68 -0
README.md ADDED
@@ -0,0 +1,68 @@
+ ---
+ tags:
+ - ML-Agents-SnowballTarget
+ - ppo
+ - deep-reinforcement-learning
+ - reinforcement-learning
+ - ml-agents
+ model-index:
+ - name: PPO
+   results:
+   - task:
+       type: reinforcement-learning
+       name: reinforcement-learning
+     dataset:
+       name: ML-Agents-SnowballTarget
+       type: ML-Agents-SnowballTarget
+     metrics:
+     - type: mean_reward
+       value: 26.02 +/- 2.14
+       name: mean_reward
+       verified: false
+ ---
+
+ # **PPO** Agent playing **ML-Agents-SnowballTarget**
+
+ This is a trained model of a **PPO** agent playing **ML-Agents-SnowballTarget** using Unity ML-Agents.
+
+ ## Usage
+
+ Download the model and play it:
+
+ ```python
+ from mlagents_envs.environment import UnityEnvironment
+ from mlagents_envs.base_env import ActionTuple
+
+ # Load the environment
+ env = UnityEnvironment(file_name="path/to/SnowballTarget")
+
+ # Reset the environment
+ env.reset()
+ behavior_names = list(env.behavior_specs)
+ spec = env.behavior_specs[behavior_names[0]]
+
+ # Load your trained model
+ # (You'll need to implement model loading based on your training framework)
+
+ # Run the environment
+ decision_steps, terminal_steps = env.get_steps(behavior_names[0])
+
+ while True:
+     # Get observations (one row per agent requesting a decision)
+     obs = decision_steps.obs[0]
+
+     # Get actions from your model; random actions stand in until one is loaded
+     # action_tuple = ActionTuple(discrete=your_model.predict(obs))
+     action_tuple = spec.action_spec.random_action(len(decision_steps))
+
+     # Step the environment
+     env.set_actions(behavior_names[0], action_tuple)
+     env.step()
+
+     decision_steps, terminal_steps = env.get_steps(behavior_names[0])
+
+     if len(terminal_steps) > 0:
+         break
+
+ env.close()
+ ```
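
The `mean_reward` value in the metadata above (`26.02 +/- 2.14`) has the form of a mean plus a spread over evaluation episodes. The README does not say how it was computed, so the following is only a minimal sketch under the assumption that the `+/-` term is the population standard deviation of per-episode returns. The function name `summarize_returns` and the sample returns are hypothetical and are not taken from this model's evaluation.

```python
from statistics import mean, pstdev


def summarize_returns(episode_returns):
    """Return (mean, population std) of per-episode cumulative rewards."""
    if not episode_returns:
        raise ValueError("need at least one episode return")
    return mean(episode_returns), pstdev(episode_returns)


# Hypothetical returns from five evaluation episodes (illustrative data only).
returns = [24.0, 26.0, 28.0, 25.0, 27.0]
avg, std = summarize_returns(returns)
print(f"mean_reward: {avg:.2f} +/- {std:.2f}")  # prints "mean_reward: 26.00 +/- 1.41"
```

In practice the per-episode returns would be accumulated inside an evaluation loop like the one in the Usage section, summing each agent's rewards until it appears in `terminal_steps`.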