Galaxea Gearbox Assembly R1 Policies

This repository contains the trained Reinforcement Learning (RL) policies for a high-precision gearbox assembly task using the Galaxea R1 robot. The models were trained with NVIDIA Isaac Lab on a single NVIDIA RTX 5090, achieving high simulation throughput and stable convergence.

Model Description

The policies are trained to control a 7-DoF robotic arm (Galaxea R1) to assemble a complex planetary gearbox. The task is decomposed into sequential sub-tasks: Approach -> Grasp -> Transport (for each gear).
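The decomposition above can be sketched as a simple per-gear state machine. This is an illustrative model only (the names `Stage` and `next_stage` are hypothetical, not code from this repository): each gear cycles through Approach -> Grasp -> Transport, and the sequence finishes when no gears remain.

```python
from enum import Enum, auto

class Stage(Enum):
    """Sub-task stages of the assembly sequence (illustrative)."""
    APPROACH = auto()
    GRASP = auto()
    TRANSPORT = auto()
    DONE = auto()

def next_stage(stage: Stage, gears_left: int) -> tuple[Stage, int]:
    """Advance the sub-task sequence; after TRANSPORT, loop back to
    APPROACH for the next gear, or finish when no gears remain."""
    if stage is Stage.APPROACH:
        return Stage.GRASP, gears_left
    if stage is Stage.GRASP:
        return Stage.TRANSPORT, gears_left
    if stage is Stage.TRANSPORT:
        gears_left -= 1
        return (Stage.APPROACH, gears_left) if gears_left > 0 else (Stage.DONE, 0)
    return Stage.DONE, 0
```

In practice each stage would be handled by the corresponding policy checkpoint listed below; the switching logic itself is an assumption here.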

  • Algorithm: PPO (Proximal Policy Optimization) via rl_games
  • Observation Space: 69-dim (Joint pos/vel, EE pose, Relative gear targets)
  • Action Space: 14-dim (Joint position targets + Gripper)
  • Training Framework: Isaac Lab (DirectRL Mode)
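A minimal sketch of what those dimensions imply for tensor shapes. The 69/14 sizes come from the list above; the clipping to [-1, 1] is a common convention for normalized continuous-action policies in rl_games, not something confirmed by this repository, so verify it against `agent_config.yaml`:

```python
OBS_DIM = 69   # joint pos/vel, EE pose, relative gear targets (per the model card)
ACT_DIM = 14   # joint position targets + gripper command (per the model card)

def clip_action(action: list[float]) -> list[float]:
    """Validate the action dimension and clip each component to [-1, 1]
    (assumed normalization; check the actual training config)."""
    if len(action) != ACT_DIM:
        raise ValueError(f"expected {ACT_DIM}-dim action, got {len(action)}")
    return [max(-1.0, min(1.0, a)) for a in action]
```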

Performance Metrics

The models were trained at a throughput of roughly 8,200 FPS (environment frames per second) using full GPU vectorization.
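For scale, combining the throughput above with the ~3-hour training time quoted under Hardware Specification implies on the order of ~90M environment frames per policy. This is a back-of-the-envelope estimate, not a figure reported with the models:

```python
fps = 8_200        # reported environment frames per second
hours = 3          # reported training time per policy
total_frames = fps * hours * 3_600
print(f"~{total_frames / 1e6:.1f}M frames per policy")  # ~88.6M
```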

| Policy | Stage | Avg Reward | Critic Loss | Entropy | Status |
|---|---|---|---|---|---|
| Approach | 1 (Foundation) | ~241.4 | 3.8e-5 | 2.58 | Converged |
| Grasp | 2 (Manipulation) | ~240.9 | 3.3e-5 | -0.92 | Converged |
| Transport 1 | 3 (Assembly) | ~282.6 | 1.7e-4 | 11.2 | Robust |

Included Files

  • policy_approach.pth: PyTorch checkpoint for the Approach phase.
  • policy_grasp.pth: PyTorch checkpoint for the Grasping phase.
  • policy_transport_gear_1.pth: PyTorch checkpoint for Transporting the first Sun Gear.
  • env_config.py: The environment configuration used for training (PhysX settings, rewards).
  • agent_config.yaml: The PPO hyperparameters.
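For orientation, an rl_games PPO config typically has roughly the following shape. The keys below are standard rl_games fields, but the values are placeholders; the actual hyperparameters live in `agent_config.yaml` and are not reproduced here:

```yaml
params:
  algo:
    name: a2c_continuous      # rl_games' continuous-action PPO/A2C agent
  config:
    name: gearbox_approach    # hypothetical run name
    gamma: 0.99               # placeholder values below; see agent_config.yaml
    learning_rate: 3e-4
    horizon_length: 16
    minibatch_size: 8192
    mini_epochs: 4
```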

Usage

These policies are designed to be loaded into the Isaac Lab environment:

```python
# Pseudo-code for loading and running inference. Note that in rl_games,
# Runner.load() takes the config dict (e.g. parsed from agent_config.yaml),
# while the checkpoint path is passed to Runner.run().
import yaml
from rl_games.torch_runner import Runner

with open('agent_config.yaml') as f:
    params = yaml.safe_load(f)

runner = Runner()
runner.load(params)
runner.run({'train': False, 'play': True, 'checkpoint': 'policy_approach.pth'})
```

Hardware Specification

  • GPU: NVIDIA GeForce RTX 5090 (32GB)
  • Training Time: ~3 hours per policy (reduced from an estimated 50+ days before optimization)
  • Simultaneous Envs: 8,192