A2C Agent for PandaReachDense-v3

This repository contains a trained Advantage Actor-Critic (A2C) agent for the PandaReachDense-v3 robotics environment from Panda-Gym.

The agent was trained using:

  • Stable-Baselines3
  • Gymnasium
  • Panda-Gym

Environment

The task involves controlling a Franka Panda robotic arm to reach a target position in 3D space.

Environment:

  • PandaReachDense-v3

Training Details

Algorithm:

  • A2C (Advantage Actor-Critic)
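
A2C updates the policy using an advantage estimate for each step of a short rollout: the n-step return minus the critic's value prediction, A_t = R_t - V(s_t). A minimal pure-Python sketch of that computation (the function name and toy numbers are illustrative, not taken from the trained agent):

```python
def n_step_advantages(rewards, values, bootstrap_value, gamma=0.99):
    """A2C advantage estimates for one rollout.

    The n-step return R_t = r_t + gamma * r_{t+1} + ... + gamma^n * V(s_{t+n})
    is accumulated backwards from a bootstrap value, and the critic's
    estimate V(s_t) is subtracted to give the advantage A_t = R_t - V(s_t).
    """
    advantages = []
    ret = bootstrap_value
    for r, v in zip(reversed(rewards), reversed(values)):
        ret = r + gamma * ret
        advantages.append(ret - v)
    advantages.reverse()
    return advantages

# Toy 3-step rollout with a constant per-step reward of -0.5
# and a critic that predicts 0 everywhere.
print(n_step_advantages([-0.5, -0.5, -0.5], [0.0, 0.0, 0.0], bootstrap_value=0.0))
```

Earlier steps accumulate more discounted future penalty, so their advantages are more negative.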

Observation Space:

  • Continuous (goal-conditioned dictionary with observation, achieved_goal, and desired_goal vectors)

Action Space:

  • Continuous (3-dimensional end-effector displacement)

Reward Type:

  • Dense reward
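
In the dense variant, every step is rewarded with the negative Euclidean distance between the end-effector and the target, giving the agent a learning signal at every timestep (the sparse variant instead returns -1 until the goal is reached). A minimal sketch of that shaping, with illustrative positions:

```python
import math

def dense_reach_reward(ee_pos, target_pos):
    """Dense reward: negative Euclidean distance to the goal.

    The closer the end-effector gets to the target, the less
    negative the reward, so the agent always has a gradient to follow.
    """
    return -math.dist(ee_pos, target_pos)

# Reward improves as the gripper approaches the target.
print(dense_reach_reward((0.0, 0.0, 0.0), (0.3, 0.0, 0.4)))  # -0.5
print(dense_reach_reward((0.3, 0.0, 0.4), (0.3, 0.0, 0.4)))  # 0.0 at the goal
```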

Evaluation Reward:

  • Mean Reward: -17.94 +/- 6.03 (mean +/- standard deviation over evaluation episodes)
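
A figure like this summarizes the total reward per episode across a set of evaluation runs as mean and standard deviation. A sketch of that summary computation (the episode returns below are illustrative, not the actual evaluation data):

```python
import statistics

def summarize_returns(episode_returns):
    """Mean and standard deviation of episodic returns."""
    mean = statistics.mean(episode_returns)
    # Population standard deviation (ddof=0), matching NumPy's default
    # as used by Stable-Baselines3's evaluate_policy.
    std = statistics.pstdev(episode_returns)
    return mean, std

# Illustrative returns only, not the evaluation data behind the reported score.
mean, std = summarize_returns([-12.0, -20.0, -16.0])
print(f"Mean Reward: {mean:.2f} +/- {std:.2f}")
```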

Usage

Install dependencies:

pip install stable-baselines3 gymnasium panda-gym huggingface_sb3

Load the model:

import gymnasium as gym
import panda_gym  # registers the Panda environments with Gymnasium
from stable_baselines3 import A2C
from huggingface_sb3 import load_from_hub

repo_id = "nirmanpatel/a2c-PandaReachDense-v3"
filename = "a2c-PandaReachDense-v3.zip"

# Download the trained checkpoint from the Hugging Face Hub
checkpoint = load_from_hub(
    repo_id=repo_id,
    filename=filename,
)

env = gym.make("PandaReachDense-v3")

model = A2C.load(checkpoint)

obs, info = env.reset()

# Run the trained policy for 1000 steps, resetting whenever an episode ends
for _ in range(1000):
    action, _states = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, info = env.step(action)

    if terminated or truncated:
        obs, info = env.reset()

Replay Video

(replay video of the trained agent, embedded on the model page)


Notes

This project demonstrates:

  • Reinforcement Learning for robotics
  • Continuous control using A2C
  • Gymnasium-compatible RL pipelines
  • Hugging Face model deployment

Author

Created by Nirman Patel
