Factory Feed-Forward PPO+OSC Teachers

This repository contains three non-recurrent RL-Games PPO+OSC teacher checkpoints for Isaac Lab Factory assembly tasks:

Isaac-Factory-PegInsert-Direct-v0
Isaac-Factory-GearMesh-Direct-v0
Isaac-Factory-NutThread-Direct-v0

The teachers were trained from the stock Isaac Lab Factory RL-Games PPO+OSC configuration with both actor and central-value LSTM blocks removed. They are intended as clean teacher policies for behavior cloning or DAgger-style data collection, where each action should be a function of the current policy observation only.

Architecture

Actor: 19D policy observation -> MLP [512, 128, 64], ELU -> Gaussian 6D OSC action.
Central critic: 43D privileged state -> MLP [512, 128, 64], ELU -> scalar value.
RL-Games seq_length=1.
Actor rnn=null.
Central-value network.rnn=null.
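
As a quick sanity check when inspecting a checkpoint, the layer widths listed above imply a rough parameter budget for each network. The sketch below is only arithmetic on those widths. It assumes plain fully connected layers with biases and a single linear output head, and it does not count the Gaussian standard-deviation parameters or any input normalizers that RL-Games may store alongside the MLP. The helper name `dense_param_count` is illustrative and not part of the repository.

```python
# Layer widths taken from the architecture list above.
ACTOR_SIZES = [19, 512, 128, 64, 6]    # policy observation -> hidden layers -> 6D action mean
CRITIC_SIZES = [43, 512, 128, 64, 1]   # privileged state -> hidden layers -> scalar value


def dense_param_count(sizes):
    """Count weights plus biases for a stack of fully connected layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


if __name__ == "__main__":
    # Assumption: one linear output head per network, biases enabled everywhere.
    print("actor MLP parameters :", dense_param_count(ACTOR_SIZES))
    print("critic MLP parameters:", dense_param_count(CRITIC_SIZES))
```

If the tensor shapes in a loaded state dict differ substantially from these widths, the checkpoint was probably built with a different network config.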

The action space is the Factory 6D operational-space-control action. These are not FORGE checkpoints and do not use the FORGE success-prediction action slot.

Checkpoints

| Task | File | Stop Epoch | Training Seed |
| --- | --- | --- | --- |
| PegInsert | `checkpoints/peginsert_ff_ppo_osc_ep100.pth` | `100` | `0` |
| GearMesh | `checkpoints/gearmesh_ff_ppo_osc_ep150.pth` | `150` | `0` |
| NutThread | `checkpoints/nutthread_ff_ppo_osc_ep100.pth` | `100` | `0` |

These files are raw PyTorch/RL-Games .pth checkpoints. Such files are pickle-based, so loading one can execute arbitrary code. Load them only in a trusted environment.

Evaluation

Evaluation used held-out reset seeds: 100, 101, and 102 for the gate runs, and 100 through 105 for the six-seed confirmation runs.

| Task | Gate Horizon | Gate Success | Six-Seed Success | Mean Return | Mean TTS |
| --- | --- | --- | --- | --- | --- |
| PegInsert | `150` | `376/384` (`97.92%`) | `750/768` (`97.66%`) | `350.12` | `35.72` |
| GearMesh | `300` | `378/384` (`98.44%`) | `750/768` (`97.66%`) | `719.53` | `75.02` |
| NutThread | `450` | `378/384` (`98.44%`) | `755/768` (`98.31%`) | `834.76` | `298.12` |
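
The percentages in the table follow directly from the success counts. A minimal sketch for recomputing them is shown below. The counts are copied from the table, while the dictionary and function names are illustrative only.

```python
# (successes, episodes) pairs copied from the evaluation table above.
GATE = {"PegInsert": (376, 384), "GearMesh": (378, 384), "NutThread": (378, 384)}
SIX_SEED = {"PegInsert": (750, 768), "GearMesh": (750, 768), "NutThread": (755, 768)}


def success_percent(successes, episodes):
    """Return the success rate as a percentage rounded to two decimals."""
    return round(100.0 * successes / episodes, 2)


if __name__ == "__main__":
    for task in GATE:
        gate = success_percent(*GATE[task])
        confirm = success_percent(*SIX_SEED[task])
        print(f"{task}: gate {gate}%, six-seed {confirm}%")
```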

All gate and confirmation summaries record agent_runtime.is_rnn=false, actor rnn=null, central-value network.rnn=null, and seq_length=1.

The JSON summaries are included under:

inspect/
eval/
videos/*_summary.json
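
Before using a checkpoint for data collection, it may be worth confirming that a summary really reports a non-recurrent agent. The sketch below checks the `agent_runtime.is_rnn` flag named above. The exact JSON layout is an assumption: the helper accepts either a flat dotted key or nested objects, and the function names are hypothetical rather than part of the repository.

```python
import json


def lookup(summary, dotted_key):
    """Fetch a value by dotted key from a flat or a nested dict."""
    if dotted_key in summary:            # flat layout: {"agent_runtime.is_rnn": false}
        return summary[dotted_key]
    node = summary
    for part in dotted_key.split("."):   # nested layout: {"agent_runtime": {"is_rnn": false}}
        node = node[part]
    return node


def is_feed_forward(summary):
    """True when the summary reports a non-recurrent agent."""
    return lookup(summary, "agent_runtime.is_rnn") is False


if __name__ == "__main__":
    # Hypothetical summary content, used only to exercise the helper.
    example = json.loads('{"agent_runtime": {"is_rnn": false}}')
    print(is_feed_forward(example))
```

In practice the dict would come from `json.load` on one of the summary files. Other fields such as `seq_length` could be checked the same way once their key paths are known.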

Rendered Success Videos

Each video is a one-environment rollout on seed 100 using real Factory sensor cameras. The external and wrist camera blank-frame counts were zero.

PegInsert

Open PegInsert video

Success: true
First success step: 53
Frames: 150
Blank frames: 0/150 external, 0/150 wrist

GearMesh

Open GearMesh video

Success: true
First success step: 47
Frames: 300
Blank frames: 0/300 external, 0/300 wrist

NutThread

Open NutThread video

Success: true
First success step: 333
Frames: 450
Blank frames: 0/450 external, 0/450 wrist

Loading Notes

These checkpoints must be loaded with the same feed-forward RL-Games config used for training. In Hydra override form, remove both recurrent blocks and force sequence length to one:

'~agent.params.network.rnn' \
'~agent.params.config.central_value_config.network.rnn' \
agent.params.config.seq_length=1
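
When launching from a Python script rather than a shell, the same overrides can be passed as an argument list. This avoids shell quoting problems with the leading `~`, which is Hydra's syntax for deleting a config node. The sketch below is hedged: the three override strings come from the fragment above, but the script path and the `--task` and `--checkpoint` flag names are assumptions about the play script in use and should be adapted to your setup.

```python
import shlex

# Hydra overrides copied from the fragment above; "~key" deletes a config node.
FEED_FORWARD_OVERRIDES = [
    "~agent.params.network.rnn",
    "~agent.params.config.central_value_config.network.rnn",
    "agent.params.config.seq_length=1",
]


def build_command(script, task, checkpoint):
    """Assemble an argv list. The script path and flag names are assumptions."""
    return [
        "python", script,
        "--task", task,
        "--checkpoint", checkpoint,
        *FEED_FORWARD_OVERRIDES,
    ]


if __name__ == "__main__":
    argv = build_command(
        "play.py",  # hypothetical script path
        "Isaac-Factory-PegInsert-Direct-v0",
        "checkpoints/peginsert_ff_ppo_osc_ep100.pth",
    )
    # shlex.join quotes the "~" arguments so the line can be pasted into a shell.
    print(shlex.join(argv))
```

Passing `argv` to `subprocess.run` directly skips the shell, so no quoting is needed in that case.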

In the source project, the runner switch was:

PHASE5_ASSEMBRAIN_DISABLE_RNN=1

Example task/checkpoint pairing:

Isaac-Factory-PegInsert-Direct-v0  -> checkpoints/peginsert_ff_ppo_osc_ep100.pth
Isaac-Factory-GearMesh-Direct-v0   -> checkpoints/gearmesh_ff_ppo_osc_ep150.pth
Isaac-Factory-NutThread-Direct-v0  -> checkpoints/nutthread_ff_ppo_osc_ep100.pth
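
For scripted data collection, the pairing above can be kept in a small lookup table so that a task is never run with the wrong checkpoint. This is a minimal sketch: the task IDs and file paths are copied from the list above, while the `checkpoint_for` helper and its `repo_root` parameter are illustrative names.

```python
from pathlib import PurePosixPath

# Task ID -> checkpoint path, copied from the pairing above.
CHECKPOINTS = {
    "Isaac-Factory-PegInsert-Direct-v0": "checkpoints/peginsert_ff_ppo_osc_ep100.pth",
    "Isaac-Factory-GearMesh-Direct-v0": "checkpoints/gearmesh_ff_ppo_osc_ep150.pth",
    "Isaac-Factory-NutThread-Direct-v0": "checkpoints/nutthread_ff_ppo_osc_ep100.pth",
}


def checkpoint_for(task, repo_root="."):
    """Return the checkpoint path for a task, relative to repo_root."""
    try:
        relative = CHECKPOINTS[task]
    except KeyError:
        raise ValueError(f"no teacher checkpoint for task {task!r}") from None
    return PurePosixPath(repo_root) / relative


if __name__ == "__main__":
    print(checkpoint_for("Isaac-Factory-GearMesh-Direct-v0"))
```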

Scope

These models are simulation teachers. They have not been validated on real hardware. They were trained for Factory PPO+OSC policy rollouts and later imitation-learning data generation, not for direct deployment.
