nirmanpatel committed on
Commit 9322bc2 · verified · 1 Parent(s): 47c65f4

Update README.md

Files changed (1)
  1. README.md +97 -7
README.md CHANGED
@@ -4,7 +4,11 @@ tags:
  - PandaReachDense-v3
  - deep-reinforcement-learning
  - reinforcement-learning
  - stable-baselines3
  model-index:
  - name: A2C
  results:
@@ -21,17 +25,103 @@ model-index:
  verified: false
  ---

- # **A2C** Agent playing **PandaReachDense-v3**
- This is a trained model of a **A2C** agent playing **PandaReachDense-v3**
- using the [stable-baselines3 library](https://github.com/DLR-RM/stable-baselines3).

- ## Usage (with Stable-baselines3)
- TODO: Add your code


  ```python
- from stable_baselines3 import ...
  from huggingface_sb3 import load_from_hub

- ...
  ```

  - PandaReachDense-v3
  - deep-reinforcement-learning
  - reinforcement-learning
+ - robotics
  - stable-baselines3
+ - gymnasium
+ - panda-gym
+
  model-index:
  - name: A2C
  results:

  verified: false
  ---

+ # A2C Agent for PandaReachDense-v3
+
+ This repository contains a trained **Advantage Actor-Critic (A2C)** agent for the **PandaReachDense-v3** robotics environment from Panda-Gym.
+
+ The agent was trained using:
+ - Stable-Baselines3
+ - Gymnasium
+ - Panda-Gym
+
+ ## Environment
+
+ The task involves controlling a Franka Panda robotic arm to reach a target position in 3D space.
+
+ Environment:
+ - PandaReachDense-v3
+
+ Frameworks:
+ - Stable-Baselines3
+ - Gymnasium
+ - Panda-Gym
+
+ ---
+
+ ## Training Details
+
+ Algorithm:
+ - A2C (Advantage Actor-Critic)
+
+ Observation Space:
+ - Continuous
+
+ Action Space:
+ - Continuous robotic control
+
+ Reward Type:
+ - Dense reward
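
To make the reward type concrete: a dense reach reward is commonly the negative Euclidean distance between the end-effector and the target, so every step carries a learning signal, whereas a sparse reward only says whether the target was reached. The sketch below contrasts the two. It is an illustration, not Panda-Gym's code: the function names, the example positions, and the 0.05 threshold are all assumptions.

```python
import math


def dense_reward(achieved, desired):
    """Negative Euclidean distance: closer to the target gives a higher (less negative) reward."""
    return -math.dist(achieved, desired)


def sparse_reward(achieved, desired, threshold=0.05):
    """0.0 when within `threshold` of the target, otherwise -1.0 (threshold is an assumed value)."""
    return 0.0 if math.dist(achieved, desired) <= threshold else -1.0


# Hypothetical positions, chosen only for illustration.
target = (0.1, 0.0, 0.2)
far = (0.4, 0.0, 0.6)    # roughly 0.5 away from the target
near = (0.1, 0.0, 0.23)  # roughly 0.03 away from the target

print(dense_reward(far, target), sparse_reward(far, target))
print(dense_reward(near, target), sparse_reward(near, target))
```

With the dense form the reward improves steadily as the arm approaches the target, which generally makes the task easier for an on-policy method such as A2C than the sparse form.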
+
+ Evaluation Reward:
+ - Mean Reward: `-17.94 +/- 6.03`
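
A figure in this form is usually the mean episode return over a set of evaluation episodes, plus or minus the standard deviation. A minimal sketch of that calculation follows. The returns are made-up placeholders, not the data behind `-17.94 +/- 6.03`, and `summarize_returns` is a hypothetical helper name. The population standard deviation (`pstdev`) is used because that is what NumPy's `std` computes by default.

```python
from statistics import fmean, pstdev


def summarize_returns(episode_returns):
    """Return (mean, population standard deviation) of per-episode returns."""
    return fmean(episode_returns), pstdev(episode_returns)


# Placeholder returns for illustration only; not this model's evaluation data.
episode_returns = [-12.0, -25.0, -16.0, -19.0]

mean_reward, std_reward = summarize_returns(episode_returns)
print(f"Mean Reward: {mean_reward:.2f} +/- {std_reward:.2f}")

# A conservative single-number score that some leaderboards use
# (an assumed convention, not something this README states).
lower_bound = mean_reward - std_reward
print(f"mean - std: {lower_bound:.2f}")
```

Under that assumed mean-minus-std convention, the reported numbers would give -17.94 - 6.03 = -23.97.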
+
+ ---
+
+ ## Usage

+ Install dependencies:

+ ```bash
+ pip install stable-baselines3 gymnasium panda-gym huggingface_sb3
+ ```
+
+ Load the model:

  ```python
+ import gymnasium as gym
+ import panda_gym  # registers the Panda environments with Gymnasium
+ from stable_baselines3 import A2C
  from huggingface_sb3 import load_from_hub

+ repo_id = "nirmanpatel/a2c-PandaReachDense-v3"
+ filename = "a2c-PandaReachDense-v3.zip"
+
+ # Download the checkpoint from the Hub; returns a local file path.
+ checkpoint = load_from_hub(
+     repo_id=repo_id,
+     filename=filename,
+ )
+
+ env = gym.make("PandaReachDense-v3")
+
+ model = A2C.load(checkpoint)
+
+ obs, info = env.reset()
+
+ for _ in range(1000):
+     action, _states = model.predict(obs, deterministic=True)
+     obs, reward, terminated, truncated, info = env.step(action)
+
+     # Start a new episode when the goal is reached or the time limit hits.
+     if terminated or truncated:
+         obs, info = env.reset()
  ```
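
The loop above steps the agent but does not record how well it did. One way to tally per-episode returns is sketched below. To keep it runnable without a simulator, `ToyReachEnv` and `toward_target_policy` are hypothetical stand-ins that only mimic the Gymnasium `reset`/`step` signatures; in the real setup, `env` and `model.predict` from the snippet above would take their place.

```python
class ToyReachEnv:
    """Hypothetical 1-D stand-in for a reach task with Gymnasium-style reset/step."""

    def __init__(self, target=1.0, max_steps=50, tolerance=0.05):
        self.target = target
        self.max_steps = max_steps
        self.tolerance = tolerance

    def reset(self):
        self.position = 0.0
        self.steps = 0
        return self.position, {}

    def step(self, action):
        self.position += max(-0.1, min(0.1, action))  # clip the move to [-0.1, 0.1]
        self.steps += 1
        distance = abs(self.target - self.position)
        reward = -distance                        # dense reward: negative distance
        terminated = distance <= self.tolerance   # goal reached
        truncated = self.steps >= self.max_steps  # time limit hit
        return self.position, reward, terminated, truncated, {}


def toward_target_policy(obs, target=1.0):
    """Hypothetical policy: always move toward the target."""
    return target - obs


def collect_returns(env, policy, total_steps):
    """Run `total_steps` steps, resetting on episode end; return completed episode returns."""
    returns, running = [], 0.0
    obs, info = env.reset()
    for _ in range(total_steps):
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        running += reward
        if terminated or truncated:
            returns.append(running)
            running = 0.0
            obs, info = env.reset()
    return returns


episode_returns = collect_returns(ToyReachEnv(), toward_target_policy, total_steps=200)
print(len(episode_returns), "episodes finished")
```

Only completed episodes are counted, so a partial episode at the end of the step budget does not skew the average.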
+
+ ---
+
+ ## Replay Video
+
+ - `agent-step-0-to-step-1000.mp4`
+
+ ---
+
+ ## Notes
+
+ This project demonstrates:
+ - Reinforcement Learning for robotics
+ - Continuous control using A2C
+ - Gymnasium-compatible RL pipelines
+ - Hugging Face model deployment
+
+ ---
+
+ ## Author
+
+ Created by Nirman Patel