---
library_name: stable-baselines3
tags:
- PandaReachDense-v3
- deep-reinforcement-learning
- reinforcement-learning
- robotics
- stable-baselines3
- gymnasium
- panda-gym
model-index:
- name: A2C
  results:
  - task:
      type: reinforcement-learning
      name: reinforcement-learning
    dataset:
      name: PandaReachDense-v3
      type: PandaReachDense-v3
    metrics:
    - type: mean_reward
      value: -17.94 +/- 6.03
      name: mean_reward
      verified: false
---

# A2C Agent for PandaReachDense-v3

This repository contains a trained **Advantage Actor-Critic (A2C)** agent for the **PandaReachDense-v3** robotics environment from Panda-Gym.

The agent was trained using:

- Stable-Baselines3
- Gymnasium
- Panda-Gym

## Environment

The task is to control a Franka Panda robotic arm so that it reaches a target position in 3D space.

Environment:

- PandaReachDense-v3

Frameworks:

- Stable-Baselines3
- Gymnasium
- Panda-Gym

---

## Training Details

Algorithm:

- A2C (Advantage Actor-Critic)

Observation Space:

- Continuous

Action Space:

- Continuous robotic control

Reward Type:

- Dense reward

Evaluation Reward:

- Mean Reward: `-17.94 +/- 6.03`

---

## Usage

Install dependencies:

```bash
pip install stable-baselines3 gymnasium panda-gym huggingface_sb3
```

Load the model:

```python
import gymnasium as gym
import panda_gym  # noqa: F401 -- importing registers the Panda environments with Gymnasium
from stable_baselines3 import A2C
from huggingface_sb3 import load_from_hub

repo_id = "nirmanpatel/a2c-PandaReachDense-v3"
filename = "a2c-PandaReachDense-v3.zip"

# Download the checkpoint from the Hugging Face Hub and get its local path
checkpoint = load_from_hub(
    repo_id=repo_id,
    filename=filename,
)

env = gym.make("PandaReachDense-v3")
model = A2C.load(checkpoint)

# Run the trained policy, resetting whenever an episode ends
obs, info = env.reset()
for _ in range(1000):
    action, _states = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)
    if terminated or truncated:
        obs, info = env.reset()

env.close()
```

---

## Notes

This project demonstrates:

- Reinforcement Learning for robotics
- Continuous control using A2C
- Gymnasium-compatible RL pipelines
- Hugging Face model deployment

---

## Author

Created by Nirman Patel
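
---

## Evaluation Sketch

The mean reward reported under Training Details has the form `mean +/- std` over evaluation episodes. The snippet below is a minimal sketch of how such a figure can be computed. It is not necessarily how the number in this card was produced. The helper name `evaluate_agent` and the default of 10 episodes are assumptions made for illustration; it only relies on the Gymnasium `reset`/`step` interface and the `predict` method already used in the Usage section.

```python
from statistics import fmean, pstdev


def evaluate_agent(model, env, n_episodes=10):
    """Run deterministic episodes and return (mean, std) of the episode returns.

    `evaluate_agent` and `n_episodes=10` are illustrative choices,
    not part of the original training or evaluation setup.
    """
    episode_returns = []
    for _ in range(n_episodes):
        obs, info = env.reset()
        done = False
        total = 0.0
        while not done:
            action, _states = model.predict(obs, deterministic=True)
            obs, reward, terminated, truncated, info = env.step(action)
            total += float(reward)  # accumulate the dense per-step reward
            done = terminated or truncated
        episode_returns.append(total)
    # Population standard deviation across episode returns
    return fmean(episode_returns), pstdev(episode_returns)
```

With `model` and `env` created as in the Usage section, `mean_reward, std_reward = evaluate_agent(model, env)` followed by `print(f"{mean_reward:.2f} +/- {std_reward:.2f}")` prints a figure in the same format. Stable-Baselines3 also ships `stable_baselines3.common.evaluation.evaluate_policy`, which returns the same pair and is usually the better choice in practice.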