	---
	library_name: lerobot
	pipeline_tag: robotics
	base_model: lerobot/pi05_base
	base_model_relation: finetune
	datasets:
	- CoRL2026-CSI/UR7e-CaP_arrange_block_100epi_10fps
	license: other
	tags:
	- lerobot
	- pi05
	- pi0.5
	- openpi
	- vision-language-action
	- imitation-learning
	- robotics
	- ur7e
	- arrange-block
	- 10fps
	- safetensors
	---

	# CoRL2026-CSI/Pi0.5-UR7e-ArrangeBlock_30epoch

	This is a LeRobot PI0.5 policy fine-tuned from
	[`lerobot/pi05_base`](https://huggingface.co/lerobot/pi05_base) on
	[`CoRL2026-CSI/UR7e-CaP_arrange_block_100epi_10fps`](https://huggingface.co/datasets/CoRL2026-CSI/UR7e-CaP_arrange_block_100epi_10fps).

	The model is intended for UR7e ArrangeBlock manipulation experiments using RGB
	observations, robot proprioception, language instructions, and continuous action chunks.
	It is uploaded as a LeRobot policy checkpoint and should be loaded through the matching
	LeRobot PI0.5 implementation used for training.

	## Model Details

	- Policy type: PI0.5
	- Base policy: `lerobot/pi05_base`
	- PaliGemma variant: `gemma_2b`
	- Action expert variant: `gemma_300m`
	- Action chunk size: `16`
	- Action steps: `16`
	- Max state/action dims: `32` / `32`
	- Vision encoder frozen: `false`
	- Train expert only: `false`
	- Gradient checkpointing: `true`
	- Training dtype: `bfloat16`
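
With an action chunk size of 16 and 16 action steps, one forward pass yields 16 consecutive actions, and all of them are executed before the policy is queried again. The sketch below illustrates that queueing pattern with a stub in place of the real model. `run_episode` and `stub_policy` are hypothetical names for illustration, not LeRobot APIs, and the 10 Hz control rate is an assumption taken from the `10fps` dataset name.

```python
from collections import deque
from typing import Callable, List

CHUNK_SIZE = 16       # actions predicted per forward pass (from this card)
N_ACTION_STEPS = 16   # actions executed before the next forward pass (from this card)
CONTROL_HZ = 10       # assumption: control rate matches the "10fps" dataset name

Action = List[float]


def run_episode(policy: Callable[[int], List[Action]], num_steps: int) -> int:
    """Run `num_steps` control steps and return how many times the policy was called."""
    queue: deque = deque()
    calls = 0
    for step in range(num_steps):
        if not queue:
            chunk = policy(step)                  # one forward pass -> CHUNK_SIZE actions
            queue.extend(chunk[:N_ACTION_STEPS])  # keep only the steps that get executed
            calls += 1
        _action = queue.popleft()                 # the action that would go to the robot
    return calls


def stub_policy(step: int) -> List[Action]:
    """Stand-in for the real model: CHUNK_SIZE zero actions of dimension 7."""
    return [[0.0] * 7 for _ in range(CHUNK_SIZE)]


calls = run_episode(stub_policy, num_steps=48)
seconds_per_chunk = N_ACTION_STEPS / CONTROL_HZ
print(calls, seconds_per_chunk)
```

Under these assumptions each inference call covers about 1.6 s of motion, so the policy cannot react to new observations within a chunk.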

	## Fine-Tuning Setup

	- Dataset: `CoRL2026-CSI/UR7e-CaP_arrange_block_100epi_10fps`
	- Training steps: `5550`
	- Approx. epochs: `30.16`
	- Final training samples: `1420800`
	- Final training loss: `0.009152`
	- Runtime: `19.12 hours`
	- Per-GPU batch size: `64`
	- Gradient accumulation steps: `2`
	- Number of GPUs: `2`
	- Effective batch size: `256`
	- Optimizer lr: `2.5e-05`
	- Optimizer betas: `[0.9, 0.95]`
	- Weight decay: `0.01`
	- Scheduler warmup/decay: `1000` / `30000`
	- Final decay lr: `2.5e-06`
	- DataLoader workers: `8`
	- DataLoader prefetch factor: `1`
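
The batch and sample figures above follow from simple arithmetic, which the short check below reproduces. The implied dataset size is an estimate only, since the epoch count on this card is rounded.

```python
per_gpu_batch = 64
grad_accum_steps = 2
num_gpus = 2
train_steps = 5550
approx_epochs = 30.16

# Samples contributing to one optimizer update.
effective_batch = per_gpu_batch * grad_accum_steps * num_gpus

# Total samples consumed over the whole run.
samples_seen = train_steps * effective_batch

# Rough dataset size implied by the reported epoch count (approximate).
implied_frames = samples_seen / approx_epochs

print(effective_batch)   # 256
print(samples_seen)      # 1420800
print(round(implied_frames))
```

The implied size comes out at roughly 47,000 frames for the 100-episode dataset.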

	## Camera Mapping

	- No explicit camera rename map is configured; the dataset camera keys are used as-is.

	## Image Augmentation

	- Disabled during fine-tuning.

	## Inputs

	- `observation.images.base_0_rgb`: `STATE`, shape `[1]`
	- `observation.images.left_wrist_0_rgb`: `STATE`, shape `[1]`
	- `observation.images.realsense_topview`: `VISUAL`, shape `[3, 480, 640]`
	- `observation.images.realsense_wrist`: `VISUAL`, shape `[3, 480, 640]`
	- `observation.images.right_wrist_0_rgb`: `STATE`, shape `[1]`
	- `observation.state`: `STATE`, shape `[7]`

	## Outputs

	- `action`: `ACTION`, shape `[7]`
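
Before running inference it can help to confirm that an observation matches the shapes listed above. The following is a minimal sketch with a hypothetical helper, `check_observation`, that works on any object exposing a `.shape` attribute (for example a NumPy array or a torch tensor). It checks unbatched shapes and covers only the two `VISUAL` streams and the state vector; the three `STATE` entries of shape `[1]` are left out because this card does not say how they are populated.

```python
from types import SimpleNamespace

# Unbatched shapes copied from the Inputs and Outputs lists on this card.
EXPECTED_SHAPES = {
    "observation.images.realsense_topview": (3, 480, 640),
    "observation.images.realsense_wrist": (3, 480, 640),
    "observation.state": (7,),
}
ACTION_SHAPE = (7,)


def check_observation(obs: dict) -> list:
    """Return a list of problems; an empty list means every shape matches."""
    problems = []
    for key, expected in EXPECTED_SHAPES.items():
        if key not in obs:
            problems.append(f"missing key: {key}")
            continue
        actual = tuple(obs[key].shape)
        if actual != expected:
            problems.append(f"{key}: expected {expected}, got {actual}")
    return problems


# Stand-ins with a `.shape` attribute, used here instead of real arrays.
good = {k: SimpleNamespace(shape=s) for k, s in EXPECTED_SHAPES.items()}
bad = dict(good)
bad["observation.images.realsense_wrist"] = SimpleNamespace(shape=(480, 640, 3))

print(check_observation(good))
print(check_observation(bad))
```

A channel-last image, as in the `bad` example, is reported as a mismatch rather than silently transposed.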

	## Usage

	Install and use the same LeRobot checkout/environment that contains the PI0.5 policy
	implementation, then point `policy.path` to this Hub repo.

	```bash
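	# <task_instruction> is a placeholder for the language prompt given to the policy.
	lerobot-record \
	--robot.type=<your_robot> \
	--dataset.repo_id=<your_eval_dataset_repo> \
	--dataset.single_task="<task_instruction>" \
	--dataset.num_episodes=10 \
	--policy.path=CoRL2026-CSI/Pi0.5-UR7e-ArrangeBlock_30epoch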
	```

	For local Python usage, load the policy with LeRobot's policy factory from the training
	checkout.
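
As a rough illustration of local loading, the sketch below uses the policy class's `from_pretrained` method instead of the policy factory. It assumes a LeRobot version in which `PI05Policy` lives at `lerobot.policies.pi05.modeling_pi05`; that module path may differ in the training checkout. `load_policy` is a hypothetical helper, and the sketch does not build the pre- and postprocessor pipelines that handle normalization.

```python
REPO_ID = "CoRL2026-CSI/Pi0.5-UR7e-ArrangeBlock_30epoch"


def load_policy(repo_id: str = REPO_ID, device: str = "cuda"):
    """Load the fine-tuned policy from the Hub and put it in eval mode.

    Assumption: the installed LeRobot exposes PI05Policy at the module path
    below. Adjust the import to match the checkout used for training.
    """
    # Imported lazily so this file can be imported without LeRobot installed.
    from lerobot.policies.pi05.modeling_pi05 import PI05Policy

    policy = PI05Policy.from_pretrained(repo_id)
    policy.to(device)
    policy.eval()
    return policy
```

Calling `load_policy()` downloads the weights on first use, so it needs network access and enough memory for the full model.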

	## Evaluation

	This upload reports offline training metrics only. No rollout success rate is claimed;
	that would require a separate real or simulated evaluation, which has not been added.

	Final logged training metrics:

	- loss: `0.009152`
	- grad norm: `0.377971`
	- learning rate: `2.500051366567086e-06`
	- update time: `6.0738 s/step`
	- dataloading time: `0.0210 s/step`
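
The final logged learning rate sits essentially at the configured final decay value of `2.5e-06`. The sketch below shows a generic linear-warmup plus cosine-decay schedule to illustrate how such a floor is reached. It is not LeRobot's scheduler code, and `lr_at` is a hypothetical function. This card does not state how the nominal decay setting of `30000` relates to the 5550-step run, so the decay horizon is left as a parameter.

```python
import math


def lr_at(step: int, peak_lr: float, final_lr: float,
          warmup_steps: int, decay_steps: int) -> float:
    """Generic schedule: linear warmup from 0, then cosine decay to `final_lr`."""
    if decay_steps <= warmup_steps:
        raise ValueError("decay_steps must be larger than warmup_steps")
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    # Progress through the decay phase, clipped to 1.0 once the horizon is passed.
    progress = min(step - warmup_steps, decay_steps - warmup_steps)
    progress /= decay_steps - warmup_steps
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return final_lr + (peak_lr - final_lr) * cosine


# Values from this card: peak 2.5e-05, final 2.5e-06, warmup 1000, decay 30000.
for s in (0, 500, 1000, 15000, 30000):
    print(s, lr_at(s, 2.5e-5, 2.5e-6, 1000, 30000))
```

In this generic form the rate equals `peak_lr` at the end of warmup and `final_lr` from `decay_steps` onward.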

	## Limitations and Safety

	This model is a robot control policy and can produce unsafe actions if deployed on
	hardware without appropriate validation, workspace limits, emergency stop handling, and
	task-specific safety checks. Test in simulation or a constrained setup before any
	physical deployment.
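
One simple task-specific check is to reject non-finite actions and clamp each of the 7 action dimensions to fixed bounds before sending a command. The sketch below is illustrative only: `clamp_action` is a hypothetical helper, and the bounds are placeholders, because this card does not state the units or meaning of the action dimensions. It complements workspace limits and emergency stop handling and does not replace them.

```python
import math
from typing import List, Sequence

# Placeholder bounds (assumption): replace with limits valid for your setup.
LOWER = [-1.0] * 7
UPPER = [1.0] * 7


def clamp_action(action: Sequence[float],
                 lower: Sequence[float] = LOWER,
                 upper: Sequence[float] = UPPER) -> List[float]:
    """Return `action` clipped element-wise to [lower, upper].

    Raises ValueError on a length mismatch or on NaN/inf values, so that a
    malformed policy output is never forwarded to the robot.
    """
    if not (len(action) == len(lower) == len(upper)):
        raise ValueError("action and bounds must have the same length")
    if not all(math.isfinite(a) for a in action):
        raise ValueError("action contains NaN or infinite values")
    return [min(max(a, lo), hi) for a, lo, hi in zip(action, lower, upper)]


print(clamp_action([2.0, -3.0, 0.5, 0.0, 0.0, 0.0, 0.0]))
```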

	The model is specialized to the training dataset, camera mapping, calibration, action
	space, and embodiment configuration. It may not transfer reliably to different robots,
	camera placements, object layouts, or tasks without further validation or fine-tuning.

	## License and Terms

	The training dataset is marked `apache-2.0`. This fine-tuned model is conservatively
	marked as `other`; users are responsible for checking the applicable base model,
	dataset, and deployment terms before use.

	## Files

	- `model.safetensors`: fine-tuned policy weights
	- `config.json`: LeRobot PI0.5 policy config
	- `train_config.json`: training configuration
	- `policy_preprocessor.json` and `policy_postprocessor.json`: LeRobot processor pipelines
	- `policy_*_processor.safetensors`: normalization/statistics state used by processors
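
To compare a downloaded checkpoint against the values in Model Details, the config file can be inspected with the standard library alone. The sketch below assumes that `config.json` stores the chunk size and action steps under the keys `chunk_size` and `n_action_steps`; those key names are an assumption and are not confirmed by this card. `summarize_config` is a hypothetical helper.

```python
import json
from pathlib import Path

# Assumed key names; adjust them if your config.json uses different ones.
KEYS_OF_INTEREST = ("chunk_size", "n_action_steps")


def summarize_config(cfg: dict) -> dict:
    """Pick the fields of interest; missing keys map to None."""
    return {key: cfg.get(key) for key in KEYS_OF_INTEREST}


if __name__ == "__main__":
    path = Path("config.json")  # path inside a local copy of this repo
    if path.exists():
        print(summarize_config(json.loads(path.read_text())))
    else:
        print("config.json not found; download the repo files first")
```

Both fields should read `16` if the assumed key names are right and the file matches this card.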