---
language:
- en
library_name: lerobot
license: gemma
pipeline_tag: robotics
tags:
- vision-language-action
- imitation-learning
- behavior-cloning
- lerobot
- pi05
- pi0.5
- openpi
- robotics
- isaaclab
- so101
- multi-task
- corl2026
- bfloat16
- full-finetune
- safetensors
datasets:
- CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi
base_model:
- lerobot/pi05_base
inference: false
---

# Pi0.5 IsaacLab Multi-Task 1 Epoch

This repository contains a Pi0.5 policy fine-tuned with LeRobot on the IsaacLab SO-101 multi-task dataset `CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi`.

## Model Details

- **Base model:** `lerobot/pi05_base`
- **Policy type:** `pi05`
- **Training type:** full fine-tuning
- **Vision encoder frozen:** no
- **Action expert only:** no
- **Checkpoint:** final checkpoint at step `13761`
- **Training length:** `1.00` epoch
- **Precision:** bfloat16
- **Format:** safetensors

## Dataset

- **Dataset:** `CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi`
- **Robot:** SO-101 follower
- **Episodes:** `3300`
- **Frames:** `3,522,774`
- **Tasks:** `800`
- **FPS:** `30`
- **Visual inputs:** `observation.images.top`, `observation.images.left_wrist`
- **State/action dimensions:** 6 DoF robot state/action, padded by the Pi0.5 policy configuration as needed

## Training Hyperparameters

| Setting | Value |
|---|---:|
| Steps | `13761` |
| Epochs | `1.00` |
| Per-device batch size | `16` |
| GPUs | `2` |
| Gradient accumulation | `8` |
| Effective batch size | `256` |
| Mixed precision | `bf16` |
| Policy dtype | `bfloat16` |
| Chunk size | `16` |
| Action steps | `16` |
| Gradient checkpointing | `true` |
| Compile model | `false` |
| DataLoader workers | `8` |
| DataLoader prefetch factor | `2` |
| Persistent workers | `true` |
| Pin memory | `true` |
| Preprocess in workers | `true` |
| DDP find unused parameters | `true` |
| Seed | `1000` |

### Optimizer and Scheduler

| Setting | Value |
|---|---:|
| Optimizer | AdamW |
| Learning rate | `2.5e-5` |
| Betas | `[0.9, 0.95]` |
| Epsilon | `1e-8` |
| Weight decay | `0.01` |
| Gradient clip norm | `1.0` |
| Scheduler | cosine decay with warmup |
| Configured warmup steps | `1000` |
| Effective warmup steps | `458` |
| Configured decay steps | `30000` |
| Effective decay steps | `13761` |
| Final decay LR | `2.5e-6` |

The scheduler was automatically scaled because `num_training_steps=13761` was smaller than the configured `num_decay_steps=30000`.

## Final Training Log Snapshot

The final logged training metrics near completion were:

- `step=13760/13761`
- `epoch=1.00`
- `loss=0.009`
- `grad_norm=0.259`
- `lr=2.5e-06`
- `updt_s=1.658`
- `data_s=0.017`

Training completed successfully on `2026-05-13 18:37:47 UTC`.

## Files

This repository includes only the inference/evaluation policy files from `pretrained_model`:

- `config.json`
- `model.safetensors`
- `train_config.json`
- `policy_preprocessor.json`
- `policy_preprocessor_step_2_normalizer_processor.safetensors`
- `policy_postprocessor.json`
- `policy_postprocessor_step_0_unnormalizer_processor.safetensors`

Optimizer state and other resumable training-state files are intentionally excluded.

## Evaluation Status

No rollout or task-success evaluation metrics are included yet. This checkpoint is intended as a reproducible 1-epoch Pi0.5 fine-tuning artifact for IsaacLab SO-101 multi-task experiments.
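
As a sanity check on the scheduler values reported under Optimizer and Scheduler, the sketch below reproduces the effective warmup and decay steps. It assumes both are scaled by the ratio of training steps to configured decay steps, with the warmup truncated to an integer. The helper names `scale_schedule` and `lr_at` are hypothetical, and the linear-warmup plus cosine shape is a generic illustration, not LeRobot's actual implementation.

```python
import math


def scale_schedule(warmup_steps: int, decay_steps: int, training_steps: int) -> tuple[int, int]:
    """Shrink warmup/decay proportionally when training is shorter than the decay horizon.

    Assumption: the warmup is scaled by training_steps / decay_steps and truncated.
    """
    if training_steps >= decay_steps:
        return warmup_steps, decay_steps
    scale = training_steps / decay_steps
    return int(warmup_steps * scale), training_steps


def lr_at(step: int, peak_lr: float, final_lr: float, warmup_steps: int, decay_steps: int) -> float:
    """Generic linear warmup followed by cosine decay from peak_lr down to final_lr."""
    if step < warmup_steps:
        return peak_lr * (step + 1) / (warmup_steps + 1)
    progress = min(step, decay_steps) / decay_steps
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return final_lr + (peak_lr - final_lr) * cosine


# Values taken from the tables in this model card.
warmup, decay = scale_schedule(warmup_steps=1000, decay_steps=30000, training_steps=13761)
print(warmup, decay)
print(f"{lr_at(13760, 2.5e-5, 2.5e-6, warmup, decay):.2e}")
```

Under these assumptions the scaled values match the `458` warmup and `13761` decay steps in the table, and the learning rate at the last logged step rounds to the reported `lr=2.5e-06`.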
## Reproducibility

Training was launched from the AutoDataCollector LeRobot workspace using the Pi0.5 IsaacLab training script configuration corresponding to:

```bash
DATASET_REPO_ID=CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi \
POLICY_PATH=lerobot/pi05_base \
BATCH_SIZE=16 \
GRADIENT_ACCUMULATION_STEPS=8 \
NUM_GPUS=2 \
STEPS=13761 \
MIXED_PRECISION=bf16 \
POLICY_DTYPE=bfloat16 \
CHUNK_SIZE=16 \
N_ACTION_STEPS=16 \
GRADIENT_CHECKPOINTING=true \
FREEZE_VISION_ENCODER=false \
TRAIN_EXPERT_ONLY=false \
NUM_WORKERS=8 \
DATALOADER_PREFETCH_FACTOR=2 \
DATALOADER_PERSISTENT_WORKERS=true \
DATALOADER_PIN_MEMORY=true \
PREPROCESS_IN_WORKERS=true \
OPTIMIZER_LR=2.5e-5 \
OPTIMIZER_WEIGHT_DECAY=0.01 \
OPTIMIZER_GRAD_CLIP_NORM=1.0 \
SCHEDULER_WARMUP_STEPS=1000 \
SCHEDULER_DECAY_STEPS=30000 \
SCHEDULER_DECAY_LR=2.5e-6 \
./lerobot/scripts/train_pi05_isaaclab.sh
```
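
The relationship between `BATCH_SIZE`, `NUM_GPUS`, `GRADIENT_ACCUMULATION_STEPS`, and `STEPS` in the command above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming each optimizer step consumes one effective batch of frames and that a partial final batch still counts as a step. The function names are hypothetical and not part of LeRobot.

```python
import math


def effective_batch_size(per_device: int, num_gpus: int, grad_accum: int) -> int:
    """Samples consumed per optimizer step across all devices and accumulation steps."""
    return per_device * num_gpus * grad_accum


def steps_per_epoch(num_frames: int, batch_size: int) -> int:
    """Optimizer steps needed to see every frame once (assumes the last partial batch is kept)."""
    return math.ceil(num_frames / batch_size)


# Values from this model card: 16 per device, 2 GPUs, 8 accumulation steps, 3,522,774 frames.
batch = effective_batch_size(per_device=16, num_gpus=2, grad_accum=8)
print(batch)
print(steps_per_epoch(num_frames=3_522_774, batch_size=batch))
```

With the values listed in this card, the result agrees with the reported effective batch size of `256` and the `13761` steps used for one epoch.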