---
language:
- en
library_name: lerobot
license: gemma
pipeline_tag: robotics
tags:
- vision-language-action
- imitation-learning
- behavior-cloning
- lerobot
- pi05
- pi0.5
- openpi
- robotics
- isaaclab
- so101
- multi-task
- corl2026
- bfloat16
- full-finetune
- safetensors
datasets:
- CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi
base_model:
- lerobot/pi05_base
inference: false
---

# Pi0.5 IsaacLab Multi-Task 1 Epoch

This repository contains a Pi0.5 policy fine-tuned with LeRobot on the IsaacLab SO-101 multi-task dataset `CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi`.

## Model Details

- **Base model:** `lerobot/pi05_base`
- **Policy type:** `pi05`
- **Training type:** full fine-tuning
- **Vision encoder frozen:** no
- **Action expert only:** no
- **Checkpoint:** final checkpoint at step `13761`
- **Training length:** `1.00` epoch
- **Precision:** bfloat16
- **Format:** safetensors

## Dataset

- **Dataset:** `CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi`
- **Robot:** SO-101 follower
- **Episodes:** `3300`
- **Frames:** `3,522,774`
- **Tasks:** `800`
- **FPS:** `30`
- **Visual inputs:** `observation.images.top`, `observation.images.left_wrist`
- **State/action dimensions:** 6-DoF robot state and action vectors, padded by the Pi0.5 policy configuration as needed

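
The padding mentioned in the last bullet can be pictured as zero-padding a 6-dimensional vector up to a fixed width that the policy expects. The following is a minimal sketch only: the target width of 32 is an illustrative assumption (the actual value should be read from this checkpoint's `config.json`), and `pad_vector` and `unpad_vector` are hypothetical helpers, not LeRobot functions.

```python
from typing import Sequence

# Assumed for illustration only; read the real width from the checkpoint's config.json.
ASSUMED_MAX_DIM = 32
SO101_DOF = 6


def pad_vector(values: Sequence[float], target_dim: int = ASSUMED_MAX_DIM) -> list[float]:
    """Right-pad a state or action vector with zeros up to target_dim."""
    if len(values) > target_dim:
        raise ValueError(f"vector has {len(values)} dims, more than target {target_dim}")
    return list(values) + [0.0] * (target_dim - len(values))


def unpad_vector(values: Sequence[float], original_dim: int = SO101_DOF) -> list[float]:
    """Keep only the leading original_dim entries of a padded vector."""
    return list(values[:original_dim])


# Hypothetical 6-DoF joint state for the SO-101 follower.
so101_state = [0.1, -0.2, 0.3, 0.0, 0.5, 1.0]
padded = pad_vector(so101_state)
print(len(padded), unpad_vector(padded))
```

In practice the policy's own pre- and postprocessors handle this step, so the sketch is only meant to show why a 6-DoF robot can be used with a model that has a wider fixed input and output size.
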
## Training Hyperparameters

| Setting | Value |
|---|---:|
| Steps | `13761` |
| Epochs | `1.00` |
| Per-device batch size | `16` |
| GPUs | `2` |
| Gradient accumulation | `8` |
| Effective batch size | `256` |
| Mixed precision | `bf16` |
| Policy dtype | `bfloat16` |
| Chunk size | `16` |
| Action steps | `16` |
| Gradient checkpointing | `true` |
| Compile model | `false` |
| DataLoader workers | `8` |
| DataLoader prefetch factor | `2` |
| Persistent workers | `true` |
| Pin memory | `true` |
| Preprocess in workers | `true` |
| DDP find unused parameters | `true` |
| Seed | `1000` |

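
The effective batch size and the step count in the table follow from the other settings. The sketch below reproduces that arithmetic, assuming one training sample per dataset frame and that the final partial batch counts as a step; the variable names are illustrative and not taken from the training script.

```python
import math

# Values copied from this model card.
per_device_batch_size = 16
num_gpus = 2
gradient_accumulation_steps = 8
total_frames = 3_522_774

# Samples consumed per optimizer step across all GPUs and accumulation steps.
effective_batch_size = per_device_batch_size * num_gpus * gradient_accumulation_steps

# Optimizer steps needed to see every frame once (assumes one sample per frame).
steps_per_epoch = math.ceil(total_frames / effective_batch_size)

print(f"effective batch size: {effective_batch_size}")
print(f"steps for one epoch:  {steps_per_epoch}")
```

Under these assumptions the result matches the `256` effective batch size and the `13761` steps listed above.
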
### Optimizer and Scheduler

| Setting | Value |
|---|---:|
| Optimizer | AdamW |
| Learning rate | `2.5e-5` |
| Betas | `[0.9, 0.95]` |
| Epsilon | `1e-8` |
| Weight decay | `0.01` |
| Gradient clip norm | `1.0` |
| Scheduler | cosine decay with warmup |
| Configured warmup steps | `1000` |
| Effective warmup steps | `458` |
| Configured decay steps | `30000` |
| Effective decay steps | `13761` |
| Final decay LR | `2.5e-6` |

Because `num_training_steps=13761` was smaller than the configured `num_decay_steps=30000`, the scheduler was automatically scaled: the decay length was shortened to `13761` steps, and the warmup was reduced in the same proportion (`1000 * 13761 / 30000`, truncated to `458`).

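
The scaling of the schedule can be sketched in a few lines. This is a minimal illustration under stated assumptions, not LeRobot's implementation: the truncating scaling rule is chosen because it reproduces the effective values reported in the table, and the warmup and cosine formulas are one common formulation whose details may differ from the library's. The function names are hypothetical.

```python
import math

# Values copied from this model card.
PEAK_LR = 2.5e-5
DECAY_LR = 2.5e-6
CONFIGURED_WARMUP_STEPS = 1000
CONFIGURED_DECAY_STEPS = 30000
NUM_TRAINING_STEPS = 13761


def scaled_schedule_lengths(num_training_steps: int, warmup: int, decay: int) -> tuple[int, int]:
    """Shrink warmup and decay proportionally when training is shorter than the decay."""
    if num_training_steps < decay:
        scale = num_training_steps / decay
        return int(warmup * scale), num_training_steps
    return warmup, decay


def learning_rate(step: int, warmup_steps: int, decay_steps: int) -> float:
    """Linear warmup toward PEAK_LR, then cosine decay toward DECAY_LR (assumed form)."""
    if step < warmup_steps:
        return PEAK_LR * (step + 1) / (warmup_steps + 1)
    progress = min(step, decay_steps) / decay_steps
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return DECAY_LR + (PEAK_LR - DECAY_LR) * cosine


warmup_steps, decay_steps = scaled_schedule_lengths(
    NUM_TRAINING_STEPS, CONFIGURED_WARMUP_STEPS, CONFIGURED_DECAY_STEPS
)
print(f"effective warmup steps: {warmup_steps}")
print(f"effective decay steps:  {decay_steps}")
print(f"lr at final step:       {learning_rate(NUM_TRAINING_STEPS - 1, warmup_steps, decay_steps):.2e}")
```

With these inputs the learning rate at the last step is essentially the configured final decay LR, which is what a schedule shortened to the actual training length is meant to achieve.
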
## Final Training Log Snapshot

The last training metrics logged before completion were:

- `step=13760/13761`
- `epoch=1.00`
- `loss=0.009`
- `grad_norm=0.259`
- `lr=2.5e-06`
- `updt_s=1.658`
- `data_s=0.017`

Training completed successfully on `2026-05-13 18:37:47 UTC`.

## Files

This repository includes only the inference/evaluation policy files from `pretrained_model`:

- `config.json`
- `model.safetensors`
- `train_config.json`
- `policy_preprocessor.json`
- `policy_preprocessor_step_2_normalizer_processor.safetensors`
- `policy_postprocessor.json`
- `policy_postprocessor_step_0_unnormalizer_processor.safetensors`

Optimizer state and other resumable training-state files are intentionally excluded.

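
After downloading the repository, it can be useful to confirm that all of the files listed above are present before attempting to load the policy. The sketch below is a generic check using only the Python standard library; `missing_checkpoint_files` is a hypothetical helper, and the temporary directory merely stands in for a local copy of the repository.

```python
import tempfile
from pathlib import Path

# File names copied from the list in this model card.
EXPECTED_FILES = (
    "config.json",
    "model.safetensors",
    "train_config.json",
    "policy_preprocessor.json",
    "policy_preprocessor_step_2_normalizer_processor.safetensors",
    "policy_postprocessor.json",
    "policy_postprocessor_step_0_unnormalizer_processor.safetensors",
)


def missing_checkpoint_files(checkpoint_dir: str) -> list[str]:
    """Return the expected file names that are not present in checkpoint_dir."""
    root = Path(checkpoint_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


# Demonstration on a throwaway directory with one file deliberately left out.
with tempfile.TemporaryDirectory() as tmp:
    for name in EXPECTED_FILES[:-1]:
        (Path(tmp) / name).touch()
    print("missing:", missing_checkpoint_files(tmp))
```

An empty result means the local copy contains every file named in this section; it does not validate the contents of those files.
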
## Evaluation Status

No rollout or task-success evaluation metrics are included yet. This checkpoint is intended as a reproducible 1-epoch Pi0.5 fine-tuning artifact for IsaacLab SO-101 multi-task experiments.

## Reproducibility

Training was launched from the AutoDataCollector LeRobot workspace using the Pi0.5 IsaacLab training script configuration corresponding to:

```bash
DATASET_REPO_ID=CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi \
POLICY_PATH=lerobot/pi05_base \
BATCH_SIZE=16 \
GRADIENT_ACCUMULATION_STEPS=8 \
NUM_GPUS=2 \
STEPS=13761 \
MIXED_PRECISION=bf16 \
POLICY_DTYPE=bfloat16 \
CHUNK_SIZE=16 \
N_ACTION_STEPS=16 \
GRADIENT_CHECKPOINTING=true \
FREEZE_VISION_ENCODER=false \
TRAIN_EXPERT_ONLY=false \
NUM_WORKERS=8 \
DATALOADER_PREFETCH_FACTOR=2 \
DATALOADER_PERSISTENT_WORKERS=true \
DATALOADER_PIN_MEMORY=true \
PREPROCESS_IN_WORKERS=true \
OPTIMIZER_LR=2.5e-5 \
OPTIMIZER_WEIGHT_DECAY=0.01 \
OPTIMIZER_GRAD_CLIP_NORM=1.0 \
SCHEDULER_WARMUP_STEPS=1000 \
SCHEDULER_DECAY_STEPS=30000 \
SCHEDULER_DECAY_LR=2.5e-6 \
./lerobot/scripts/train_pi05_isaaclab.sh
```