Galaxea Gearbox Assembly R1 Policies

This repository contains trained Reinforcement Learning (RL) policies for a high-precision gearbox assembly task on the Galaxea R1 robot. The models were trained with NVIDIA Isaac Lab on a single NVIDIA RTX 5090 at roughly 8,200 simulation frames per second, and each policy reached the converged or robust status reported under Performance Metrics.

Model Description

The policies are trained to control a 7-DoF robotic arm (Galaxea R1) to assemble a complex planetary gearbox. The task is decomposed into sequential sub-tasks: Approach -> Grasp -> Transport (for each gear).

Algorithm: PPO (Proximal Policy Optimization) via rl_games
Observation Space: 69-dim (Joint pos/vel, EE pose, Relative gear targets)
Action Space: 14-dim (Joint position targets + Gripper)
Training Framework: Isaac Lab (DirectRL Mode)
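
The Approach -> Grasp -> Transport decomposition implies that a controller has to decide which checkpoint is active at any moment. The sketch below is one minimal way to express that hand-off, not the sequencing logic used in this repository. The checkpoint names are taken from the Included Files list; `PhaseSequencer`, `advance`, and the success flag are hypothetical names introduced for illustration, and only the first gear's transport checkpoint is assumed to be available.

```python
# Checkpoint names come from the "Included Files" list.
# Everything else here (class, method, success flag) is hypothetical.
PHASE_ORDER = ("approach", "grasp", "transport")
PHASE_CHECKPOINTS = {
    "approach": "policy_approach.pth",
    "grasp": "policy_grasp.pth",
    "transport": "policy_transport_gear_1.pth",
}


class PhaseSequencer:
    """Tracks which sub-task policy should be active for the current gear."""

    def __init__(self) -> None:
        self.index = 0

    @property
    def phase(self) -> str:
        return PHASE_ORDER[self.index]

    @property
    def checkpoint(self) -> str:
        return PHASE_CHECKPOINTS[self.phase]

    def advance(self, phase_succeeded: bool) -> bool:
        """Step to the next phase on success.

        Returns True only when the final phase (transport) succeeds,
        i.e. the gear is placed; the sequencer then resets to "approach".
        """
        if not phase_succeeded:
            return False  # stay in the current phase and keep trying
        if self.index == len(PHASE_ORDER) - 1:
            self.index = 0
            return True
        self.index += 1
        return False


seq = PhaseSequencer()
print(seq.phase, seq.checkpoint)
```

How "success" is detected for each phase (for example, a distance threshold or a contact check) depends on the environment definition in env_config.py and is not modeled here.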

Performance Metrics

The models were trained at a throughput of ~8,200 FPS (frames per second) using full GPU vectorization.

Policy	Stage	Avg Reward	Critic Loss	Entropy	Status
Approach	1 (Foundation)	~241.4	3.8e-5	2.58	Converged
Grasp	2 (Manipulation)	~240.9	3.3e-5	-0.92	Converged
Transport 1	3 (Assembly)	~282.6	1.7e-4	11.2	Robust

Included Files

policy_approach.pth: PyTorch checkpoint for the Approach phase.
policy_grasp.pth: PyTorch checkpoint for the Grasping phase.
policy_transport_gear_1.pth: PyTorch checkpoint for Transporting the first Sun Gear.
env_config.py: The environment configuration used for training (PhysX settings, rewards).
agent_config.yaml: The PPO hyperparameters.

Usage

These policies are designed to be loaded into the Isaac Lab environment:

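# Sketch: run a checkpoint in play (inference) mode with rl_games
import yaml
from rl_games.torch_runner import Runner

with open('agent_config.yaml') as f:
    agent_cfg = yaml.safe_load(f)  # PPO hyperparameters shipped with this repo

# Assumes the Isaac Lab task is already registered with rl_games under the env name in agent_cfg
runner = Runner()
runner.load(agent_cfg)  # Runner.load expects the config dict, not a checkpoint path
runner.run({'train': False, 'play': True, 'checkpoint': 'policy_approach.pth'})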

Hardware Specification

GPU: NVIDIA GeForce RTX 5090 (32GB)
Training Time: ~3 hours per policy (reduced from 50+ days before optimization)
Simultaneous Envs: 8,192
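
As a rough sanity check, the reported throughput and training time can be combined into an approximate sample budget per policy. This is back-of-the-envelope arithmetic, not a logged result. It assumes that the ~8,200 FPS figure is the aggregate rate across all environments and that it was sustained for the full ~3 hours; neither assumption is stated in this card.

```python
FPS = 8_200        # reported throughput (frames per second)
TRAIN_HOURS = 3    # reported training time per policy
NUM_ENVS = 8_192   # reported number of simultaneous environments

# Assumption: FPS is the aggregate rate over all environments, held constant.
total_frames = FPS * TRAIN_HOURS * 3600
frames_per_env = total_frames / NUM_ENVS

print(f"total frames per policy: {total_frames:,}")
print(f"frames per environment:  {frames_per_env:,.0f}")
```

Under these assumptions each policy sees on the order of 88.6 million frames. If the FPS figure is instead measured per environment or excludes update time, the budget changes accordingly.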

