pi05-block-transfer-lerobot

π₀.₅ fine-tune for bimanual red-cube handover / block transfer on the Trossen AI stationary (WidowX AI) platform. Trained with lerobot (LoRA, relative actions). This Run Record documents the step-40000 checkpoint.

Demo — on-robot eval

Prompt
Grab and handover the red cube to the other arm

Async on-robot evaluation of this checkpoint (step-40000) on the Trossen AI stationary bimanual platform — block transfer / red-cube handover. See RLE-34.

TL;DR

  • Policy: π₀.₅ — PaliGemma gemma_2b VLM + gemma_300m action expert (flow matching).
  • Framework: lerobot-latest (--policy.type=pi05).
  • Method: LoRA (r=32, α=32) on attention + MLP projections of both the VLM language model and the action expert; base = lerobot/pi05_base.
  • Action space: relative/delta on the 12 arm joints, absolute on the 2 grippers.
  • Robot: Trossen AI stationary bimanual, 14-DoF, 4 cameras, 30 fps.
  • Checkpoint: step 40000.

How it differs from the openpi fine-tune

| Repo | Framework | Action space | Method |
| --- | --- | --- | --- |
| pi05-block-transfer-lerobot (this) | lerobot-latest | relative (arms) | LoRA |
| openpi pi05_trossen_transfer_block | openpi | absolute | LoRA |

How it was trained (relative actions)

Relative actions are implemented as processor steps baked into the saved pipeline (not a model flag), so evaluation handles them automatically:

  • Preprocessing: delta_actions_processor (enabled: true): action -= state.
  • Postprocessing: absolute_actions_processor (enabled: true): action += state.
  • exclude_joints: [left_carriage_joint, right_carriage_joint] → the two grippers stay absolute; the 12 arm joints are relative.

The server/loader re-wires the paired steps on load (_reconnect_relative_absolute_steps), so no special flag is needed at inference.
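
The paired steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the lerobot processor code: the function names and the gripper positions (indices 6 and 13 of the 14-D vector) are assumptions made for the example, and the real pipeline selects the excluded joints by name via exclude_joints.

```python
import numpy as np

# Assumed layout (hypothetical): each arm contributes 6 joints followed by its
# gripper, so the two grippers sit at indices 6 and 13 of the 14-D vector.
GRIPPER_IDX = [6, 13]
RELATIVE_MASK = np.ones(14, dtype=bool)
RELATIVE_MASK[GRIPPER_IDX] = False  # excluded joints stay absolute


def to_relative(actions: np.ndarray, state: np.ndarray) -> np.ndarray:
    """Preprocessing sketch: subtract the current state from the arm joints."""
    out = actions.copy()
    out[:, RELATIVE_MASK] -= state[RELATIVE_MASK]
    return out


def to_absolute(actions: np.ndarray, state: np.ndarray) -> np.ndarray:
    """Postprocessing sketch: add the current state back to the arm joints."""
    out = actions.copy()
    out[:, RELATIVE_MASK] += state[RELATIVE_MASK]
    return out


rng = np.random.default_rng(0)
state = rng.uniform(-1.0, 1.0, size=14)        # current joint positions
chunk = rng.uniform(-1.0, 1.0, size=(50, 14))  # one action chunk (chunk_size = 50)

relative = to_relative(chunk, state)    # what the policy is trained on
restored = to_absolute(relative, state)  # what is sent to the robot
```

Because the two steps are exact inverses for a given state, the round trip recovers the original chunk, and the gripper columns pass through untouched.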

Model configuration (config.json)

| Field | Value |
| --- | --- |
| type | pi05 |
| paligemma_variant / action_expert_variant | gemma_2b / gemma_300m |
| dtype | bfloat16 |
| input_features | observation.state (14); images cam_high, cam_low, cam_left_wrist, cam_right_wrist (3×480×640) |
| output_features | action (14) |
| chunk_size / n_action_steps | 50 / 50 |
| num_inference_steps | 10 |
| n_obs_steps | 1 |
| max_state_dim / max_action_dim | 32 / 32 |
| image_resolution | 224 × 224 |
| tokenizer_max_length | 200 |
| normalization | STATE=QUANTILES, ACTION=QUANTILES, VISUAL=IDENTITY |
| flow time-sampling | beta(α=1.5, β=1.0), scale=0.999, offset=0.001 |
| period (min/max) | 0.004 / 4.0 |
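
To make the flow-matching fields concrete, here is a rough sketch of how the time-sampling values and num_inference_steps are typically used. It assumes the convention that t = 1 is pure noise and t = 0 is the clean action chunk, with Euler integration at inference. The toy_velocity function is a hypothetical oracle standing in for the action expert, so only the sampling and stepping arithmetic is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Training: t ~ Beta(1.5, 1.0), then scaled and offset so that t never reaches 0.
t_train = rng.beta(1.5, 1.0, size=10_000) * 0.999 + 0.001

# Inference: start from noise at t = 1 and take Euler steps down to t = 0.
NUM_INFERENCE_STEPS = 10
target = rng.normal(size=(50, 32))  # stand-in for the clean (padded) action chunk
noise = rng.normal(size=(50, 32))


def toy_velocity(x_t: np.ndarray, t: float) -> np.ndarray:
    # Oracle velocity for the straight path x_t = t * noise + (1 - t) * target.
    # A trained model would predict this from images, state, and the prompt.
    return (x_t - target) / t


x = noise.copy()
dt = -1.0 / NUM_INFERENCE_STEPS
for k in range(NUM_INFERENCE_STEPS):
    t = 1.0 + k * dt  # 1.0, 0.9, ..., 0.1
    x = x + dt * toy_velocity(x, t)
```

With an exact velocity field the ten steps land on the target chunk; with a learned field, more steps trade latency for integration accuracy.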

Training hyperparameters

| Field | Value |
| --- | --- |
| optimizer | AdamW |
| learning rate | 2.5e-5 |
| betas | (0.9, 0.95) |
| eps | 1e-8 |
| weight_decay | 0.01 |
| grad_clip_norm | 1.0 |
| scheduler | cosine (warmup 1000, decay_steps 30000, decay_lr 2.5e-6) |
| gradient_checkpointing | false |
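
The scheduler values imply that the learning rate reaches its floor at step 30000 and stays there for the last 10000 steps of the 40000-step run. A minimal sketch of such a schedule follows. The linear warmup from zero and the cosine measured from step 0 are assumptions, and the exact shape used by lerobot may differ slightly.

```python
import math

PEAK_LR = 2.5e-5
DECAY_LR = 2.5e-6
WARMUP_STEPS = 1_000
DECAY_STEPS = 30_000


def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step (illustrative schedule)."""
    if step < WARMUP_STEPS:
        # Assumed linear warmup from 0 to the peak learning rate.
        return PEAK_LR * step / WARMUP_STEPS
    # Cosine decay from PEAK_LR to DECAY_LR, clamped once DECAY_STEPS is reached.
    progress = min(step, DECAY_STEPS) / DECAY_STEPS
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return DECAY_LR + (PEAK_LR - DECAY_LR) * cosine


for step in (0, 500, 1_000, 15_000, 30_000, 40_000):
    print(f"step {step:>6}: lr = {lr_at(step):.3e}")
```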

LoRA (adapter_config.json)

| Field | Value |
| --- | --- |
| peft_type / version | LORA / 0.19.1 |
| r / lora_alpha / lora_dropout | 32 / 32 / 0.0 |
| bias | none |
| target_modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj on PaliGemma language_model + gemma_expert |
| modules_to_save | state_proj, action_in_proj, action_out_proj, time_mlp_in, time_mlp_out |
| base_model | lerobot/pi05_base |

model.safetensors in this repo is the full merged model (base + adapter, 7.9 GB); the adapter_* files are also included for reference. from_pretrained loads the merged weights directly — no separate base download needed.
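
For readers unfamiliar with what "merged" means here, the following toy example shows the standard LoRA arithmetic with this run's r and lora_alpha. It is a sketch on a small random matrix, not the PEFT implementation; the layer dimensions and variable names are made up for illustration.

```python
import numpy as np

R, ALPHA = 32, 32
SCALE = ALPHA / R  # LoRA scaling factor; equals 1.0 for r = alpha = 32

rng = np.random.default_rng(0)
d_in, d_out = 64, 48  # toy dimensions, far smaller than the real projections

W = rng.normal(size=(d_out, d_in))    # frozen base weight
A = rng.normal(size=(R, d_in)) * 0.01   # LoRA down-projection
B = rng.normal(size=(d_out, R)) * 0.01  # LoRA up-projection (nonzero, as after training)


def lora_forward(x: np.ndarray) -> np.ndarray:
    """Adapter path: base output plus the scaled low-rank update."""
    return x @ W.T + SCALE * (x @ A.T) @ B.T


# Merging folds the low-rank update into the base weight once, offline.
W_merged = W + SCALE * (B @ A)

x = rng.normal(size=(4, d_in))
y_adapter = lora_forward(x)
y_merged = x @ W_merged.T
```

The two outputs agree up to floating-point error, which is why the merged model.safetensors can be loaded without the base checkpoint or the adapter files.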

Inputs (reproducibility manifest)

  • Dataset: TrossenRoboticsCommunity/stationary-block-transfer-lerobot-v3 @ v3.0.
  • Base model: lerobot/pi05_base.
  • Framework: lerobot-latest (--policy.type=pi05).
  • Batch size: 8.
  • Total steps: 40,000 (this is the final checkpoint).
  • Training env / hardware: local-5090 (run on a local RTX 5090 using the Trossen cloud pipeline).

Outputs

  • Weights: model.safetensors (merged), adapter_model.safetensors (LoRA).
  • Config: config.json, adapter_config.json.
  • Processors: policy_preprocessor.json (+ normalizer state), policy_postprocessor.json (+ unnormalizer state).
  • TensorBoard logs: not available with this checkpoint (RLE DoD item — add if recoverable).

Evaluation (async inference)

Verified to load and run via the async policy server + Trossen async client. Relative→absolute conversion confirmed active. Run from the lerobot_trossen workspace:

# Terminal A — policy server
uv run python -m lerobot.async_inference.policy_server \
  --host=127.0.0.1 --port=8080 --fps=30 --inference_latency=0.033 --obs_queue_timeout=2

# Terminal B — robot client
uv run lerobot-trossen-async-client \
  --server_address=127.0.0.1:8080 \
  --robot.type=bi_widowxai_follower_robot \
  --robot.left_arm_ip_address=192.168.1.5 --robot.right_arm_ip_address=192.168.1.4 \
  --robot.id=bimanual_follower \
  --robot.cameras='{ cam_high: {...}, cam_low: {...}, cam_left_wrist: {...}, cam_right_wrist: {...} }' \
  --task="Grab and handover the red cube to the other arm" \
  --policy_type=pi05 \
  --pretrained_name_or_path=TrossenRoboticsCommunity/pi05-block-transfer-lerobot \
  --policy_device=cuda \
  --actions_per_chunk=50 --chunk_size_threshold=0.5 --aggregate_fn_name=weighted_average

The --task prompt must match the prompt used in training, because π₀.₅ is language-conditioned.
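
The client flags --actions_per_chunk, --chunk_size_threshold, and --aggregate_fn_name interact roughly as sketched below. This is an illustration of the idea only: the request rule and the 0.3/0.7 weights are assumptions about the async client's behavior, and the helper names are hypothetical.

```python
ACTIONS_PER_CHUNK = 50
CHUNK_SIZE_THRESHOLD = 0.5
FPS = 30


def should_request_chunk(queue_len: int) -> bool:
    """Ask the server for a new chunk once the queue is at most half full."""
    return queue_len / ACTIONS_PER_CHUNK <= CHUNK_SIZE_THRESHOLD


def weighted_average(old: float, new: float, w_new: float = 0.7) -> float:
    # Assumed weights: favor the newer prediction for an overlapping timestep.
    return (1.0 - w_new) * old + w_new * new


def merge_chunks(queue: dict, incoming: dict) -> dict:
    """Blend overlapping timesteps and append the timesteps that are new."""
    merged = dict(queue)
    for step, action in incoming.items():
        if step in merged:
            merged[step] = weighted_average(merged[step], action)
        else:
            merged[step] = action
    return merged


# One scalar per timestep keeps the example small; real actions are 14-D vectors.
queue = {25: 1.0, 26: 1.0}
incoming = {26: 2.0, 27: 2.0}
merged = merge_chunks(queue, incoming)
chunk_seconds = ACTIONS_PER_CHUNK / FPS  # about 1.67 s of motion per chunk
```

Under these assumptions a new observation is sent when 25 or fewer actions remain, which leaves a little under a second of queued motion at 30 fps to cover inference latency.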

Eval Results

| Eval Task | Reps | Success | Notes |
| --- | --- | --- | --- |
| block-transfer | TODO | TODO | policy confirmed working on hardware; formal rep count TODO |
