	---
	language:
	- en
	library_name: lerobot
	license: gemma
	pipeline_tag: robotics
	tags:
	- vision-language-action
	- imitation-learning
	- behavior-cloning
	- lerobot
	- pi05
	- pi0.5
	- openpi
	- robotics
	- isaaclab
	- so101
	- multi-task
	- corl2026
	- bfloat16
	- full-finetune
	- safetensors
	datasets:
	- CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi
	base_model:
	- lerobot/pi05_base
	inference: false
	---

	# Pi0.5 IsaacLab Multi-Task 1 Epoch

	This repository contains a Pi0.5 policy fine-tuned with LeRobot on the IsaacLab SO-101 multi-task dataset `CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi`.

	## Model Details

	- Base model: `lerobot/pi05_base`
	- Policy type: `pi05`
	- Training type: full fine-tuning
	- Vision encoder frozen: no
	- Action expert only: no
	- Checkpoint: final checkpoint at step `13761`
	- Training length: `1.00` epoch
	- Precision: bfloat16
	- Format: safetensors

	## Dataset

	- Dataset: `CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi`
	- Robot: SO-101 follower
	- Episodes: `3300`
	- Frames: `3,522,774`
	- Tasks: `800`
	- FPS: `30`
	- Visual inputs: `observation.images.top`, `observation.images.left_wrist`
	- State/action dimensions: 6 DoF robot state/action, padded by the Pi0.5 policy configuration as needed

	## Training Hyperparameters

	| Setting | Value |
	|---|---:|
	| Steps | `13761` |
	| Epochs | `1.00` |
	| Per-device batch size | `16` |
	| GPUs | `2` |
	| Gradient accumulation | `8` |
	| Effective batch size | `256` |
	| Mixed precision | `bf16` |
	| Policy dtype | `bfloat16` |
	| Chunk size | `16` |
	| Action steps | `16` |
	| Gradient checkpointing | `true` |
	| Compile model | `false` |
	| DataLoader workers | `8` |
	| DataLoader prefetch factor | `2` |
	| Persistent workers | `true` |
	| Pin memory | `true` |
	| Preprocess in workers | `true` |
	| DDP find unused parameters | `true` |
	| Seed | `1000` |
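
The effective batch size and step count in the table follow from the per-device batch size, GPU count, gradient accumulation, and the dataset's frame count. A minimal Python sketch of that arithmetic, assuming one training sample per frame and that a final partial batch counts as one step (both are assumptions, not confirmed by the training logs):

```python
import math

# Values taken from the tables in this card.
per_device_batch_size = 16
num_gpus = 2
grad_accum_steps = 8
num_frames = 3_522_774

# Samples consumed per optimizer step across all devices.
effective_batch_size = per_device_batch_size * num_gpus * grad_accum_steps

# Optimizer steps needed to see every frame once (assumes one sample per frame).
steps_per_epoch = math.ceil(num_frames / effective_batch_size)

print(effective_batch_size, steps_per_epoch)
```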

	### Optimizer and Scheduler

	| Setting | Value |
	|---|---:|
	| Optimizer | AdamW |
	| Learning rate | `2.5e-5` |
	| Betas | `[0.9, 0.95]` |
	| Epsilon | `1e-8` |
	| Weight decay | `0.01` |
	| Gradient clip norm | `1.0` |
	| Scheduler | cosine decay with warmup |
	| Configured warmup steps | `1000` |
	| Effective warmup steps | `458` |
	| Configured decay steps | `30000` |
	| Effective decay steps | `13761` |
	| Final decay LR | `2.5e-6` |

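	Because `num_training_steps=13761` was smaller than the configured `num_decay_steps=30000`, the scheduler automatically scaled both lengths to fit the run: warmup went from `1000` to `458` steps and decay from `30000` to `13761` steps.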
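
The effective warmup and decay values above are consistent with scaling both configured lengths by the ratio of actual to configured steps. The sketch below reproduces that arithmetic together with a generic linear-warmup plus cosine-decay curve. It assumes proportional scaling with truncation and a warmup that starts near zero. The function names are hypothetical, and this is not a copy of LeRobot's scheduler implementation.

```python
import math

# Values taken from the optimizer and scheduler table in this card.
PEAK_LR = 2.5e-5
FINAL_LR = 2.5e-6
CONFIGURED_WARMUP = 1000
CONFIGURED_DECAY = 30000
TRAINING_STEPS = 13761


def scale_schedule(warmup, decay, training_steps):
    """Shrink warmup and decay proportionally when the run is shorter than decay (assumed rule)."""
    if training_steps >= decay:
        return warmup, decay
    return warmup * training_steps // decay, training_steps


def lr_at(step, warmup, decay):
    """Generic linear warmup followed by cosine decay from PEAK_LR to FINAL_LR."""
    if step < warmup:
        return PEAK_LR * (step + 1) / (warmup + 1)
    progress = min(step, decay) / decay
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return FINAL_LR + (PEAK_LR - FINAL_LR) * cosine


warmup, decay = scale_schedule(CONFIGURED_WARMUP, CONFIGURED_DECAY, TRAINING_STEPS)
print(warmup, decay)
print(lr_at(decay, warmup, decay))
```

Under these assumptions the curve ends at the final decay LR on the last step, which agrees with the `lr=2.5e-06` value reported in the final training log.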

	## Final Training Log Snapshot

	The last training metrics logged before completion were:

	- `step=13760/13761`
	- `epoch=1.00`
	- `loss=0.009`
	- `grad_norm=0.259`
	- `lr=2.5e-06`
	- `updt_s=1.658`
	- `data_s=0.017`

	Training completed successfully on `2026-05-13 18:37:47 UTC`.

	## Files

	This repository includes only the inference/evaluation policy files from `pretrained_model`:

	- `config.json`
	- `model.safetensors`
	- `train_config.json`
	- `policy_preprocessor.json`
	- `policy_preprocessor_step_2_normalizer_processor.safetensors`
	- `policy_postprocessor.json`
	- `policy_postprocessor_step_0_unnormalizer_processor.safetensors`

	Optimizer state and other resumable training-state files are intentionally excluded.
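
Before loading the checkpoint for inference or evaluation, it can help to confirm that a local copy of the repository contains every file listed above. The following is a small sketch using only the Python standard library. The helper name `missing_policy_files` and the `./pretrained_model` path are hypothetical placeholders.

```python
from pathlib import Path

# File names as listed in this model card.
EXPECTED_FILES = [
    "config.json",
    "model.safetensors",
    "train_config.json",
    "policy_preprocessor.json",
    "policy_preprocessor_step_2_normalizer_processor.safetensors",
    "policy_postprocessor.json",
    "policy_postprocessor_step_0_unnormalizer_processor.safetensors",
]


def missing_policy_files(model_dir):
    """Return the expected policy files that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # Placeholder path: point this at wherever the repository was downloaded.
    print(missing_policy_files("./pretrained_model"))
```

An empty list means all seven files are present. Any names returned are the ones still to be downloaded.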

	## Evaluation Status

	No rollout or task-success evaluation metrics are included yet. This checkpoint is intended as a reproducible 1-epoch Pi0.5 fine-tuning artifact for IsaacLab SO-101 multi-task experiments.

	## Reproducibility

	Training was launched from the AutoDataCollector LeRobot workspace with the Pi0.5 IsaacLab training script, using a configuration equivalent to:

	```bash
	DATASET_REPO_ID=CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi \
	POLICY_PATH=lerobot/pi05_base \
	BATCH_SIZE=16 \
	GRADIENT_ACCUMULATION_STEPS=8 \
	NUM_GPUS=2 \
	STEPS=13761 \
	MIXED_PRECISION=bf16 \
	POLICY_DTYPE=bfloat16 \
	CHUNK_SIZE=16 \
	N_ACTION_STEPS=16 \
	GRADIENT_CHECKPOINTING=true \
	FREEZE_VISION_ENCODER=false \
	TRAIN_EXPERT_ONLY=false \
	NUM_WORKERS=8 \
	DATALOADER_PREFETCH_FACTOR=2 \
	DATALOADER_PERSISTENT_WORKERS=true \
	DATALOADER_PIN_MEMORY=true \
	PREPROCESS_IN_WORKERS=true \
	OPTIMIZER_LR=2.5e-5 \
	OPTIMIZER_WEIGHT_DECAY=0.01 \
	OPTIMIZER_GRAD_CLIP_NORM=1.0 \
	SCHEDULER_WARMUP_STEPS=1000 \
	SCHEDULER_DECAY_STEPS=30000 \
	SCHEDULER_DECAY_LR=2.5e-6 \
	./lerobot/scripts/train_pi05_isaaclab.sh
	```