CoRL2026-CSI/Pi0.5-IsaacLab-Multi-Task-1epochs

+---
+language:
+- en
+library_name: lerobot
+license: gemma
+pipeline_tag: robotics
+tags:
+- vision-language-action
+- imitation-learning
+- behavior-cloning
+- lerobot
+- pi05
+- pi0.5
+- openpi
+- robotics
+- isaaclab
+- so101
+- multi-task
+- corl2026
+- bfloat16
+- full-finetune
+- safetensors
+datasets:
+- CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi
+base_model:
+- lerobot/pi05_base
+inference: false
+---
+# Pi0.5 IsaacLab Multi-Task 1 Epoch
+This repository contains a Pi0.5 policy fine-tuned with LeRobot on the IsaacLab SO-101 multi-task dataset `CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi`.
+## Model Details
+- **Base model:** `lerobot/pi05_base`
+- **Policy type:** `pi05`
+- **Training type:** full fine-tuning
+- **Vision encoder frozen:** no
+- **Action expert only:** no
+- **Checkpoint:** final checkpoint at step `13761`
+- **Training length:** `1.00` epoch
+- **Precision:** bfloat16
+- **Format:** safetensors
+## Dataset
+- **Dataset:** `CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi`
+- **Robot:** SO-101 follower
+- **Episodes:** `3300`
+- **Frames:** `3,522,774`
+- **Tasks:** `800`
+- **FPS:** `30`
+- **Visual inputs:** `observation.images.top`, `observation.images.left_wrist`
+- **State/action dimensions:** 6-DoF robot state and action, padded to the dimensions set in the Pi0.5 policy configuration
+## Training Hyperparameters
+| Setting | Value |
+|---|---:|
+| Steps | `13761` |
+| Epochs | `1.00` |
+| Per-device batch size | `16` |
+| GPUs | `2` |
+| Gradient accumulation | `8` |
+| Effective batch size | `256` |
+| Mixed precision | `bf16` |
+| Policy dtype | `bfloat16` |
+| Chunk size | `16` |
+| Action steps | `16` |
+| Gradient checkpointing | `true` |
+| Compile model | `false` |
+| DataLoader workers | `8` |
+| DataLoader prefetch factor | `2` |
+| Persistent workers | `true` |
+| Pin memory | `true` |
+| Preprocess in workers | `true` |
+| DDP find unused parameters | `true` |
+| Seed | `1000` |
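
The effective batch size and step count in the table above are consistent with the dataset size. The sketch below reproduces that arithmetic, assuming that one optimizer step consumes one effective batch of frames and that the final partial batch is rounded up to a full step. The variable names are illustrative and are not taken from the training script.

```python
import math

# Values copied from the Dataset and Training Hyperparameters sections.
per_device_batch = 16
num_gpus = 2
grad_accum_steps = 8
total_frames = 3_522_774

# Assumption: effective batch = per-device batch x GPUs x accumulation steps.
effective_batch = per_device_batch * num_gpus * grad_accum_steps

# Assumption: one epoch is one pass over all frames, rounded up to whole steps.
steps_per_epoch = math.ceil(total_frames / effective_batch)

print(effective_batch, steps_per_epoch)  # 256 13761
```

Under these assumptions, `13761` steps correspond to one pass over the `3,522,774` frames, which matches the reported `1.00` epoch.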
+### Optimizer and Scheduler
+| Setting | Value |
+|---|---:|
+| Optimizer | AdamW |
+| Learning rate | `2.5e-5` |
+| Betas | `[0.9, 0.95]` |
+| Epsilon | `1e-8` |
+| Weight decay | `0.01` |
+| Gradient clip norm | `1.0` |
+| Scheduler | cosine decay with warmup |
+| Configured warmup steps | `1000` |
+| Effective warmup steps | `458` |
+| Configured decay steps | `30000` |
+| Effective decay steps | `13761` |
+| Final decay LR | `2.5e-6` |
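+
+Because `num_training_steps=13761` is smaller than the configured `num_decay_steps=30000`, the scheduler was rescaled automatically: the decay horizon was shortened to `13761` steps, and the warmup was scaled by the same ratio (`13761 / 30000 ≈ 0.459`), from `1000` to `458` steps.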
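
The rescaling can be sketched in a few lines of Python. This is a minimal illustration, not the LeRobot implementation: the function names `scale_schedule` and `learning_rate` are hypothetical, the truncation with `int` is inferred from the reported value of `458`, and the linear-warmup plus cosine-decay shape is an assumed common form whose exact curve may differ from the one used in training.

```python
import math

def scale_schedule(num_training_steps, num_warmup_steps, num_decay_steps):
    """Shrink warmup and decay when the run is shorter than the decay horizon."""
    if num_training_steps < num_decay_steps:
        scale = num_training_steps / num_decay_steps
        # Assumption: the scaled warmup is truncated to an integer.
        return int(num_warmup_steps * scale), num_training_steps
    return num_warmup_steps, num_decay_steps

def learning_rate(step, peak_lr, decay_lr, warmup_steps, decay_steps):
    """Assumed shape: linear warmup to peak_lr, then cosine decay to decay_lr."""
    if step < warmup_steps:
        return peak_lr * (step + 1) / (warmup_steps + 1)
    progress = min(step, decay_steps) / decay_steps
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return decay_lr + (peak_lr - decay_lr) * cosine

warmup, decay = scale_schedule(13761, 1000, 30000)
final_lr = learning_rate(13760, 2.5e-5, 2.5e-6, warmup, decay)
print(warmup, decay, f"{final_lr:.1e}")  # 458 13761 2.5e-06
```

With the configured values, the sketch reproduces the effective warmup and decay steps in the table and reaches the final decay learning rate at the last logged step.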
+## Final Training Log Snapshot
+The last metrics logged before training finished were:
+- `step=13760/13761`
+- `epoch=1.00`
+- `loss=0.009`
+- `grad_norm=0.259`
+- `lr=2.5e-06`
+- `updt_s=1.658`
+- `data_s=0.017`
+
+Training completed successfully on `2026-05-13 18:37:47 UTC`.
+## Files
+This repository includes only the inference/evaluation policy files from `pretrained_model`:
+- `config.json`
+- `model.safetensors`
+- `train_config.json`
+- `policy_preprocessor.json`
+- `policy_preprocessor_step_2_normalizer_processor.safetensors`
+- `policy_postprocessor.json`
+- `policy_postprocessor_step_0_unnormalizer_processor.safetensors`
+
+Optimizer state and other resumable training-state files are intentionally excluded.
+## Evaluation Status
+No rollout or task-success evaluation metrics are included yet. This checkpoint is intended as a reproducible 1-epoch Pi0.5 fine-tuning artifact for IsaacLab SO-101 multi-task experiments.
+## Reproducibility
+Training was launched from the AutoDataCollector LeRobot workspace with the Pi0.5 IsaacLab training script, using a configuration equivalent to:
+```bash
+DATASET_REPO_ID=CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi \
+POLICY_PATH=lerobot/pi05_base \
+BATCH_SIZE=16 \
+GRADIENT_ACCUMULATION_STEPS=8 \
+NUM_GPUS=2 \
+STEPS=13761 \
+MIXED_PRECISION=bf16 \
+POLICY_DTYPE=bfloat16 \
+CHUNK_SIZE=16 \
+N_ACTION_STEPS=16 \
+GRADIENT_CHECKPOINTING=true \
+FREEZE_VISION_ENCODER=false \
+TRAIN_EXPERT_ONLY=false \
+NUM_WORKERS=8 \
+DATALOADER_PREFETCH_FACTOR=2 \
+DATALOADER_PERSISTENT_WORKERS=true \
+DATALOADER_PIN_MEMORY=true \
+PREPROCESS_IN_WORKERS=true \
+OPTIMIZER_LR=2.5e-5 \
+OPTIMIZER_WEIGHT_DECAY=0.01 \
+OPTIMIZER_GRAD_CLIP_NORM=1.0 \
+SCHEDULER_WARMUP_STEPS=1000 \
+SCHEDULER_DECAY_STEPS=30000 \
+SCHEDULER_DECAY_LR=2.5e-6 \
+./lerobot/scripts/train_pi05_isaaclab.sh
+```