pi05-block-transfer-lerobot

π₀.₅ fine-tune for bimanual red-cube handover / block transfer on the Trossen AI stationary (WidowX AI) platform. Trained with lerobot (LoRA, relative actions). This Run Record documents the step-40000 checkpoint.

Demo — on-robot eval

Prompt: Grab and handover the red cube to the other arm

Asynchronous on-robot evaluation of this checkpoint (step-40000) on the Trossen AI stationary bimanual platform, performing the block-transfer (red-cube handover) task. See RLE-34.

TL;DR

Policy: π₀.₅ — PaliGemma gemma_2b VLM + gemma_300m action expert (flow matching).
Framework: lerobot-latest (--policy.type=pi05).
Method: LoRA (r=32, α=32) on attention + MLP projections of both the VLM language model and the action expert; base = lerobot/pi05_base.
Action space: relative/delta on the 12 arm joints, absolute on the 2 grippers.
Robot: Trossen AI stationary bimanual, 14-DoF, 4 cameras, 30 fps.
Checkpoint: step 40000.

How it differs from the openpi fine-tune

Repo	Framework	Action space	Method
pi05-block-transfer-lerobot (this)	lerobot-latest	relative (arms)	LoRA
openpi `pi05_trossen_transfer_block`	openpi	absolute	LoRA

How it was trained (relative actions)

Relative actions are implemented as processor steps baked into the saved pipeline (not a model flag), so evaluation handles them automatically:

Preprocessing — delta_actions_processor (enabled: true): action -= state.
Postprocessing — absolute_actions_processor (enabled: true): action += state.
exclude_joints: [left_carriage_joint, right_carriage_joint] → the two grippers stay absolute; the 12 arm joints are relative.

The server/loader re-wires the paired steps on load (_reconnect_relative_absolute_steps), so no special flag is needed at inference.
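The two processor steps above amount to a masked subtract and a masked add. The following is a minimal sketch of that arithmetic, not the lerobot implementation: the function names are hypothetical, and the joint layout (six arm joints followed by one gripper per arm, left arm first) is an assumption that the source does not state.

```python
from typing import List, Sequence

# Assumed joint layout (not stated in the source): 6 arm joints + 1 gripper
# per arm, left arm first, so the two grippers sit at indices 6 and 13.
GRIPPER_INDICES = frozenset({6, 13})


def to_relative(action: Sequence[float], state: Sequence[float]) -> List[float]:
    """Preprocessing direction: subtract the state from every non-excluded joint."""
    return [
        a if i in GRIPPER_INDICES else a - s
        for i, (a, s) in enumerate(zip(action, state))
    ]


def to_absolute(action: Sequence[float], state: Sequence[float]) -> List[float]:
    """Postprocessing direction: add the state back to every non-excluded joint."""
    return [
        a if i in GRIPPER_INDICES else a + s
        for i, (a, s) in enumerate(zip(action, state))
    ]


# Toy 14-DoF example: the target is the current state shifted by 0.05 on every joint.
state = [0.1 * i for i in range(14)]
target = [0.1 * i + 0.05 for i in range(14)]
delta = to_relative(target, state)    # arm joints become offsets, grippers untouched
restored = to_absolute(delta, state)  # round-trips back to the absolute target
```

Because both directions use the same state and the same exclusion set, the round trip is lossless, which is why the paired steps have to be re-wired together on load.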

Model configuration (`config.json`)

Field	Value
type	`pi05`
paligemma_variant / action_expert_variant	`gemma_2b` / `gemma_300m`
dtype	`bfloat16`
input_features	`observation.state` (14); images `cam_high`, `cam_low`, `cam_left_wrist`, `cam_right_wrist` (3×480×640)
output_features	`action` (14)
chunk_size / n_action_steps	50 / 50
num_inference_steps	10
n_obs_steps	1
max_state_dim / max_action_dim	32 / 32
image_resolution	224 × 224
tokenizer_max_length	200
normalization	STATE=QUANTILES, ACTION=QUANTILES, VISUAL=IDENTITY
flow time-sampling	beta(α=1.5, β=1.0), scale=0.999, offset=0.001
time-embedding period (min_period / max_period)	0.004 / 4.0
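To make the flow time-sampling row concrete, here is a small sketch of how such a sample could be drawn: a Beta(1.5, 1.0) draw, multiplied by the scale and shifted by the offset. It uses only the Python standard library; the function name is hypothetical, and the order of operations (scale first, then offset) is an assumption based on the parameter names.

```python
import random


def sample_flow_time(
    rng: random.Random,
    alpha: float = 1.5,
    beta: float = 1.0,
    scale: float = 0.999,
    offset: float = 0.001,
) -> float:
    """Draw one flow-matching time: a Beta(alpha, beta) sample, then an affine rescale."""
    return rng.betavariate(alpha, beta) * scale + offset


rng = random.Random(0)  # fixed seed so the sketch is repeatable
times = [sample_flow_time(rng) for _ in range(10_000)]
mean_time = sum(times) / len(times)  # Beta(1.5, 1.0) has mean 0.6 before rescaling
```

With these values every sample lies in [0.001, 1.0], so the time never reaches exactly zero, and the Beta(1.5, 1.0) shape puts more mass on larger times than a uniform draw would.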

Training hyperparameters

Field	Value
optimizer	AdamW
learning rate	2.5e-5
betas	(0.9, 0.95)
eps	1e-8
weight_decay	0.01
grad_clip_norm	1.0
scheduler	cosine — warmup 1000, decay_steps 30000, decay_lr 2.5e-6
gradient_checkpointing	false
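The scheduler row can be read as a learning-rate curve. The sketch below is one plausible reading under stated assumptions: warmup is linear from zero, the cosine phase runs from the end of warmup to decay_steps, and the rate is held at decay_lr afterwards. The scheduler lerobot actually uses may differ in these details; the function name is hypothetical.

```python
import math

PEAK_LR = 2.5e-5      # learning rate from the table
DECAY_LR = 2.5e-6     # floor reached at the end of the cosine phase
WARMUP_STEPS = 1_000
DECAY_STEPS = 30_000


def lr_at(step: int) -> float:
    """Linear warmup to PEAK_LR, cosine decay to DECAY_LR, then hold."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    # Assumption: the cosine phase spans WARMUP_STEPS..DECAY_STEPS.
    span = DECAY_STEPS - WARMUP_STEPS
    progress = min(step - WARMUP_STEPS, span) / span
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return DECAY_LR + (PEAK_LR - DECAY_LR) * cosine


schedule = {step: lr_at(step) for step in (0, 500, 1_000, 15_500, 30_000, 40_000)}
```

Under this reading the final 10,000 of the 40,000 training steps run at the constant floor of 2.5e-6.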

LoRA (`adapter_config.json`)

Field	Value
peft_type / version	LORA / 0.19.1
r / lora_alpha / lora_dropout	32 / 32 / 0.0
bias	none
target_modules	`q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj` on PaliGemma `language_model` + `gemma_expert`
modules_to_save	`state_proj, action_in_proj, action_out_proj, time_mlp_in, time_mlp_out`
base_model	`lerobot/pi05_base`

model.safetensors in this repo is the full merged model (base + adapter, 7.9 GB); the adapter_* files are also included for reference. from_pretrained loads the merged weights directly — no separate base download needed.
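As an illustration of what "merged" means here, the sketch below folds a low-rank update into a base weight using the standard LoRA scaling of lora_alpha / r. It is a toy, pure-Python example with hypothetical helper names, not the export code used for this checkpoint. With r=32 and lora_alpha=32 the scale is exactly 1.0.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain nested-list matrix product, enough for a toy example."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(base: Matrix, lora_b: Matrix, lora_a: Matrix, r: int, lora_alpha: float) -> Matrix:
    """Return base + (lora_alpha / r) * (lora_b @ lora_a)."""
    scale = lora_alpha / r
    update = matmul(lora_b, lora_a)
    return [
        [w + scale * u for w, u in zip(w_row, u_row)]
        for w_row, u_row in zip(base, update)
    ]


# Toy 2x2 weight with a rank-1 adapter (the real run uses r=32, lora_alpha=32).
base = [[1.0, 0.0], [0.0, 1.0]]
lora_b = [[1.0], [2.0]]   # shape (out_features, r)
lora_a = [[0.5, -0.5]]    # shape (r, in_features)
merged = merge_lora(base, lora_b, lora_a, r=1, lora_alpha=1.0)
```

The modules listed under modules_to_save are not low-rank updates; in PEFT they are trained and stored in full, so merging simply keeps their trained copies.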

Inputs (reproducibility manifest)

Dataset: TrossenRoboticsCommunity/stationary-block-transfer-lerobot-v3 @ v3.0.
Base model: lerobot/pi05_base.
Framework: lerobot-latest (--policy.type=pi05).
Batch size: 8.
Total steps: 40,000 (this is the final checkpoint).
Training env / hardware: local-5090 (run on a local RTX 5090 using the Trossen cloud pipeline).

Outputs

Weights: model.safetensors (merged), adapter_model.safetensors (LoRA).
Config: config.json, adapter_config.json.
Processors: policy_preprocessor.json (+ normalizer state), policy_postprocessor.json (+ unnormalizer state).
TensorBoard logs: not available with this checkpoint (RLE DoD item — add if recoverable).

Evaluation (async inference)

The checkpoint has been verified to load and run with the async policy server and the Trossen async client, and the relative→absolute conversion was confirmed active. Run both commands from the lerobot_trossen workspace:

# Terminal A — policy server
uv run python -m lerobot.async_inference.policy_server \
  --host=127.0.0.1 --port=8080 --fps=30 --inference_latency=0.033 --obs_queue_timeout=2

# Terminal B — robot client
uv run lerobot-trossen-async-client \
  --server_address=127.0.0.1:8080 \
  --robot.type=bi_widowxai_follower_robot \
  --robot.left_arm_ip_address=192.168.1.5 --robot.right_arm_ip_address=192.168.1.4 \
  --robot.id=bimanual_follower \
  --robot.cameras='{ cam_high: {...}, cam_low: {...}, cam_left_wrist: {...}, cam_right_wrist: {...} }' \
  --task="Grab and handover the red cube to the other arm" \
  --policy_type=pi05 \
  --pretrained_name_or_path=TrossenRoboticsCommunity/pi05-block-transfer-lerobot \
  --policy_device=cuda \
  --actions_per_chunk=50 --chunk_size_threshold=0.5 --aggregate_fn_name=weighted_average

The task prompt must match the prompt used in training, because π₀.₅ is language-conditioned.
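The client flags --actions_per_chunk=50, --chunk_size_threshold=0.5 and --aggregate_fn_name=weighted_average in the command above interact as follows, assuming they behave as their names suggest: the client asks for a new chunk once the share of unexecuted actions drops to the threshold, and overlapping timesteps from the old and new chunk are blended. The sketch below is illustrative only. Both function names are hypothetical, and the 0.7 blend weight is an assumption, not a value taken from this document.

```python
def should_request_new_chunk(queue_size: int, actions_per_chunk: int, threshold: float) -> bool:
    """True once the fraction of unexecuted actions falls to the threshold or below."""
    return queue_size / actions_per_chunk <= threshold


def blend(old: float, new: float, new_weight: float = 0.7) -> float:
    """Weighted average for a timestep covered by both the old and the new chunk.

    new_weight=0.7 is an illustrative assumption, not a value from this document.
    """
    return (1.0 - new_weight) * old + new_weight * new


# Count down the action queue from a full chunk and find the first trigger point.
first_trigger = next(
    remaining
    for remaining in range(50, -1, -1)
    if should_request_new_chunk(remaining, actions_per_chunk=50, threshold=0.5)
)
```

With a chunk of 50 and a threshold of 0.5, the request fires when 25 actions remain, which at 30 fps leaves a little under a second of queued motion to cover inference latency.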

Eval Results

Eval Task	Reps	Success	Notes
block-transfer	TODO	TODO	policy confirmed working on hardware; formal rep count TODO

