Franka Cube-Push — Isaac Lab PPO Policy

A robotic manipulation policy trained in NVIDIA Isaac Lab 2.3.2 (Isaac Sim 5.1) with rsl_rl PPO across 4,096 GPU-parallel environments on a single RTX 4090.

The task is Isaac-Push-Cube-Franka-v0, a custom environment authored from scratch in Isaac Lab's manager-based framework. A Franka Panda arm must push a cube to a sampled target position on a table.

Performance

~23% success rate (5 cm position threshold) on held-out goals.
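As a rough illustration of how a thresholded success rate like this can be computed, the sketch below counts an episode as successful when the cube's final position lies within 5 cm of its target. The function names are hypothetical, and whether the actual evaluation measures planar or full 3D distance is an assumption; the code simply uses Euclidean distance over whatever coordinates it is given.

```python
import math

SUCCESS_THRESHOLD_M = 0.05  # 5 cm position threshold quoted above


def is_success(cube_pos, target_pos, threshold=SUCCESS_THRESHOLD_M):
    """True when the cube ends within `threshold` metres of the target."""
    return math.dist(cube_pos, target_pos) <= threshold


def success_rate(final_cube_positions, target_positions):
    """Fraction of episodes whose final cube position meets the threshold."""
    if len(final_cube_positions) != len(target_positions):
        raise ValueError("need one target per episode")
    if not final_cube_positions:
        raise ValueError("need at least one episode")
    hits = sum(
        is_success(cube, target)
        for cube, target in zip(final_cube_positions, target_positions)
    )
    return hits / len(final_cube_positions)
```

Called with one final cube position and one target per evaluation episode, `success_rate` returns a value in [0, 1]; a result near 0.23 would correspond to the figure reported here.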

Architecture

Actor/Critic MLP: [256, 128, 64], ELU activations.
Observations (36-dim): joint pos/vel, object position, target position, last action — with observation normalization.
Actions (8-dim): 7 arm joints + gripper, joint-position control.
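To make the network shape concrete, here is a minimal pure-Python sketch of an MLP with the layer sizes listed above: 36 inputs, hidden layers of 256, 128, and 64 units with ELU activations, and 8 action outputs (1 value output for the critic). It assumes a linear output layer and omits the action-noise standard deviation that a PPO actor also carries; the initialization scheme and all function names are illustrative assumptions, not the repository's code.

```python
import math
import random

OBS_DIM = 36            # observation size from the list above
ACT_DIM = 8             # 7 arm joints + gripper
HIDDEN = [256, 128, 64]


def elu(x, alpha=1.0):
    """Exponential linear unit."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def init_mlp(sizes, rng):
    """Build a list of (weights, biases) pairs, one per linear layer."""
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        scale = 1.0 / math.sqrt(n_in)  # assumed init, for illustration only
        weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(layers, x):
    """Apply each linear layer, with ELU on all but the last."""
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            x = [elu(v) for v in x]
    return x


def param_count(layers):
    """Total number of weights and biases."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)


rng = random.Random(0)
actor = init_mlp([OBS_DIM, *HIDDEN, ACT_DIM], rng)
critic = init_mlp([OBS_DIM, *HIDDEN, 1], rng)

obs = [0.0] * OBS_DIM            # placeholder for a normalized observation
action_mean = forward(actor, obs)  # 8 joint-position targets
value = forward(critic, obs)       # scalar state-value estimate
```

Under these assumptions the actor has 51,144 parameters and the critic 50,689, which is small enough that environment simulation, not the network, is likely to dominate training cost.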

Training config

PPO, entropy_coef=0.002, observation normalization on, 4,096 envs, 1,500 iterations. Rewards: end-effector reaching, goal-distance tracking (coarse + fine), sparse success bonus, light action penalties.
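The reward terms listed above can be sketched as a weighted sum. The example below is only an illustration of that structure: it uses a tanh-shaped distance kernel (a common shaping choice, not confirmed by this card), and every weight, kernel width, and function name is a hypothetical placeholder rather than a value from the actual training run.

```python
import math


def tanh_kernel(distance, std):
    """Map a distance to (0, 1]; equals 1 at zero distance, decays with range."""
    return 1.0 - math.tanh(distance / std)


# Hypothetical weights, chosen only to show the structure.
DEFAULT_WEIGHTS = {
    "reach": 1.0,          # end-effector reaching the cube
    "goal_coarse": 5.0,    # wide kernel: signal even far from the goal
    "goal_fine": 5.0,      # narrow kernel: sharpens precision near the goal
    "success": 10.0,       # sparse bonus inside the success threshold
    "action_rate": -1e-3,  # light penalty on action changes
}


def push_reward(ee_to_cube, cube_to_goal, action, prev_action,
                success_threshold=0.05, weights=None):
    """Return (total reward, per-term values) for one environment step.

    Distances are in metres; all kernel widths below are assumed values.
    """
    w = dict(DEFAULT_WEIGHTS)
    if weights:
        w.update(weights)
    terms = {
        "reach": tanh_kernel(ee_to_cube, std=0.1),
        "goal_coarse": tanh_kernel(cube_to_goal, std=0.3),
        "goal_fine": tanh_kernel(cube_to_goal, std=0.05),
        "success": 1.0 if cube_to_goal <= success_threshold else 0.0,
        "action_rate": sum((a - p) ** 2 for a, p in zip(action, prev_action)),
    }
    total = sum(w[name] * value for name, value in terms.items())
    return total, terms
```

Pairing a wide and a narrow kernel on the same goal distance is one way to get the "coarse + fine" behaviour: the wide term gives a gradient from far away, while the narrow term only becomes significant within a few centimetres of the target.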

Usage

Load the policy in Isaac Lab with the matching task. The full task code, training commands, reward-shaping ablation, and debugging narrative are in the repository:

Code & writeup: https://github.com/IAmHassanMehmood/Issac-Lab-RL

Honest limitations

Open-gripper pushing is inherently imprecise; success plateaus in the low 20s (percent). The value of this artifact lies in the *custom task and systematic debugging methodology*, not in a saturated benchmark number.

