Franka Cube-Push: Isaac Lab PPO Policy
A robotic manipulation policy trained in NVIDIA Isaac Lab 2.3.2 (Isaac Sim 5.1) with rsl_rl PPO across 4,096 GPU-parallel environments on a single RTX 4090.
The task is a custom Isaac-Push-Cube-Franka-v0 environment, authored from scratch in Isaac Lab's manager-based framework, in which a Franka Panda arm must push a cube to a sampled target position on a table.
Performance
- ~23% success rate (5 cm position threshold) on held-out goals.
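To make the metric concrete, here is a minimal sketch of how a success rate with a 5 cm threshold could be computed. The function names are hypothetical. The sketch also assumes that success is judged on the cube's final position using the full 3D Euclidean distance; the actual evaluation may differ (for example, planar distance only).

```python
import math

SUCCESS_THRESHOLD_M = 0.05  # 5 cm position threshold, as stated above


def is_success(cube_pos, target_pos, threshold=SUCCESS_THRESHOLD_M):
    """Return True if the cube lies within `threshold` metres of the target.

    Assumption: full Euclidean distance over all provided coordinates.
    """
    return math.dist(cube_pos, target_pos) <= threshold


def success_rate(final_cube_positions, target_positions):
    """Fraction of episodes whose final cube position counts as a success."""
    outcomes = [
        is_success(cube, target)
        for cube, target in zip(final_cube_positions, target_positions)
    ]
    return sum(outcomes) / len(outcomes)
```

Calling `success_rate` on the final cube positions and sampled targets of a batch of evaluation episodes gives the kind of figure reported here.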
Architecture
- Actor/Critic MLP: [256, 128, 64], ELU activations.
- Observations (36-dim): joint pos/vel, object position, target position, last action, with observation normalization.
- Actions (8-dim): 7 arm joints + gripper, joint-position control.
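For a sense of scale, the sketch below counts the trainable weights and biases implied by the dimensions above. It assumes the actor and critic are separate fully connected networks with the same hidden sizes, and that the critic outputs a single scalar value. It does not count any action-noise parameters or normalizer statistics. The helper name `mlp_param_count` is hypothetical.

```python
def mlp_param_count(in_dim, hidden_dims, out_dim):
    """Count weights and biases of a fully connected MLP."""
    dims = [in_dim, *hidden_dims, out_dim]
    # Each layer contributes (inputs * outputs) weights plus `outputs` biases.
    return sum(n_in * n_out + n_out for n_in, n_out in zip(dims, dims[1:]))


OBS_DIM = 36             # observation size, from the list above
ACTION_DIM = 8           # 7 arm joints + gripper
HIDDEN = [256, 128, 64]  # hidden layer sizes, from the list above

actor_params = mlp_param_count(OBS_DIM, HIDDEN, ACTION_DIM)
critic_params = mlp_param_count(OBS_DIM, HIDDEN, 1)  # assumed scalar value head
```

Under these assumptions the actor has 51,144 parameters and the critic 50,689, so the policy is small enough for inexpensive inference.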
Training config
- Algorithm: PPO, entropy_coef=0.002, observation normalization on, 4,096 envs, 1,500 iterations.
- Rewards: end-effector reaching, goal-distance tracking (coarse + fine), sparse success bonus, light action penalties.
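The reward terms listed above can be illustrated with a small, framework-free sketch. Everything numeric here is hypothetical: the weights, the kernel widths, the `1 - tanh(d / std)` shaping, and the use of an action-rate penalty are illustrative assumptions, not the values or functions used in training. Only the 5 cm success threshold comes from this card.

```python
import math

# Hypothetical weights and kernel widths, chosen only for illustration.
WEIGHTS = {
    "reach": 1.0,         # end-effector reaching
    "goal_coarse": 4.0,   # goal-distance tracking, wide kernel
    "goal_fine": 2.0,     # goal-distance tracking, narrow kernel
    "success": 10.0,      # sparse success bonus
    "action_rate": -1e-3, # light action penalty (assumed: action-rate form)
}
COARSE_STD_M = 0.3
FINE_STD_M = 0.05
SUCCESS_THRESHOLD_M = 0.05  # 5 cm threshold stated in this card


def tanh_kernel(distance, std):
    """Map a distance to (0, 1]: 1 at zero distance, decaying as distance/std grows."""
    return 1.0 - math.tanh(distance / std)


def shaped_reward(ee_pos, cube_pos, target_pos, action, last_action):
    """Return (total reward, per-term values) for a single environment step."""
    ee_to_cube = math.dist(ee_pos, cube_pos)
    cube_to_goal = math.dist(cube_pos, target_pos)
    terms = {
        "reach": tanh_kernel(ee_to_cube, COARSE_STD_M),
        "goal_coarse": tanh_kernel(cube_to_goal, COARSE_STD_M),
        "goal_fine": tanh_kernel(cube_to_goal, FINE_STD_M),
        "success": 1.0 if cube_to_goal <= SUCCESS_THRESHOLD_M else 0.0,
        "action_rate": sum((a - b) ** 2 for a, b in zip(action, last_action)),
    }
    total = sum(WEIGHTS[name] * value for name, value in terms.items())
    return total, terms
```

The wide kernel provides a learning signal while the cube is still far from the target, the narrow kernel sharpens that signal near the target, and the sparse bonus applies only inside the success threshold.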
Usage
Load the policy in Isaac Lab with the matching task (Isaac-Push-Cube-Franka-v0). The full task code, training commands, reward-shaping ablation, and debugging narrative are in the repository:
Code & writeup: https://github.com/IAmHassanMehmood/Issac-Lab-RL
Honest limitations
Open-gripper pushing is inherently imprecise; the success rate plateaus in the low 20s (percent). The value of this artifact lies in the *custom task and systematic debugging methodology*, not in a saturated benchmark number.