LuckLin committed · Commit 8a93911 · verified · 1 Parent(s): ef25238

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +43 -11
README.md CHANGED
@@ -2,9 +2,11 @@
2
  library_name: stable-baselines3
3
  tags:
4
  - PandaReachDense-v3
5
- - deep-reinforcement-learning
6
  - reinforcement-learning
7
  - stable-baselines3
8
  model-index:
9
  - name: A2C
10
  results:
@@ -16,22 +18,52 @@ model-index:
16
  type: PandaReachDense-v3
17
  metrics:
18
  - type: mean_reward
19
- value: -0.25 +/- 0.15
20
  name: mean_reward
21
- verified: false
22
  ---
23
 
24
- # **A2C** Agent playing **PandaReachDense-v3**
25
- This is a trained model of a **A2C** agent playing **PandaReachDense-v3**
26
- using the [stable-baselines3 library](https://github.com/DLR-RM/stable-baselines3).
27
 
28
- ## Usage (with Stable-baselines3)
29
- TODO: Add your code
30
 
31
 
32
  ```python
33
- from stable_baselines3 import ...
34
  from huggingface_sb3 import load_from_hub
35
 
36
- ...
37
- ```
 
2
  library_name: stable-baselines3
3
  tags:
4
  - PandaReachDense-v3
 
5
  - reinforcement-learning
6
  - stable-baselines3
7
+ - a2c
8
+ - deep-rl
9
+ - panda-gym
10
  model-index:
11
  - name: A2C
12
  results:
 
18
  type: PandaReachDense-v3
19
  metrics:
20
  - type: mean_reward
21
+ value: 0.00 +/- 0.00 # Update this with the result you printed earlier
22
  name: mean_reward
 
23
  ---
24
 
25
+ # A2C Agent playing PandaReachDense-v3
 
 
26
 
27
+ This is a trained model of an **A2C** agent playing **PandaReachDense-v3** using the [stable-baselines3](https://github.com/DLR-RM/stable-baselines3) library and the [panda-gym](https://github.com/qgallouedec/panda-gym) environment.
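Review note: the `mean_reward` value in the new metadata is still the placeholder `0.00 +/- 0.00`. Below is a minimal sketch of how a `mean +/- std` string could be built from per-episode returns, using only the Python standard library. The helper name `format_mean_reward` and the sample returns are hypothetical. Real values would come from an evaluation run, for example stable-baselines3's `evaluate_policy`, which returns a mean and a standard deviation.

```python
from statistics import mean, pstdev


def format_mean_reward(episode_returns):
    """Format per-episode returns as 'mean +/- std' with two decimals."""
    if not episode_returns:
        raise ValueError("need at least one episode return")
    avg = mean(episode_returns)
    # Population standard deviation (divides by N, not N - 1).
    std = pstdev(episode_returns)
    return f"{avg:.2f} +/- {std:.2f}"


# Hypothetical returns from four evaluation episodes (made-up numbers,
# not results for this model).
example_returns = [-0.30, -0.20, -0.40, -0.10]
print(format_mean_reward(example_returns))
```

The resulting string can be pasted into the `value:` field of the `mean_reward` metric in the model card metadata.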
 
28
 
29
+ ## Video Replay
30
+
31
+ ![Replay Video](replay.mp4)
32
+
33
+ ## Usage (with huggingface_sb3)
34
+
35
+ To use this model, you need to install the following dependencies:
36
+
37
+ ```bash
38
+ pip install stable-baselines3 huggingface_sb3 panda_gym shimmy
39
+ ```
40
+ Then you can load and evaluate the model:
41
 
42
  ```python
 
43
  from huggingface_sb3 import load_from_hub
44
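+ import gymnasium as gym
+ import panda_gym  # importing registers the Panda environments with Gymnasium
+ from stable_baselines3 import A2C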
45
+ from stable_baselines3.common.vec_env import DummyVecEnv, VecNormalize
46
+
47
+ # Load the model and statistics
48
+ repo_id = "LuckLin/a2c-PandaReachDense-v3"
49
+ filename = "a2c-PandaReachDense-v3.zip"
50
+
51
+ checkpoint = load_from_hub(repo_id, filename)
52
+ model = A2C.load(checkpoint)
53
+
54
+ # Load the normalization statistics
55
+ stats_path = load_from_hub(repo_id, "vec_normalize.pkl")
56
+ env = DummyVecEnv([lambda: gym.make("PandaReachDense-v3")])
57
+ env = VecNormalize.load(stats_path, env)
58
+
59
+ # At test time, we don't update the stats
60
+ env.training = False
61
+ env.norm_reward = False
62
+
63
+ # Evaluate
64
+ obs = env.reset()
65
+ for _ in range(1000):
66
+     action, _states = model.predict(obs, deterministic=True)
67
+     obs, rewards, dones, info = env.step(action)
68
+     env.render()
69