---
language:
- en
library_name: lerobot
license: gemma
pipeline_tag: robotics
tags:
- vision-language-action
- imitation-learning
- behavior-cloning
- lerobot
- pi05
- pi0.5
- openpi
- robotics
- isaaclab
- so101
- multi-task
- corl2026
- bfloat16
- full-finetune
- safetensors
datasets:
- CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi
base_model:
- lerobot/pi05_base
inference: false
---
# Pi0.5 IsaacLab Multi-Task 1 Epoch
This repository contains a Pi0.5 policy fine-tuned with LeRobot on the IsaacLab SO-101 multi-task dataset `CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi`.
## Model Details
- **Base model:** `lerobot/pi05_base`
- **Policy type:** `pi05`
- **Training type:** full fine-tuning
- **Vision encoder frozen:** no
- **Action expert only:** no
- **Checkpoint:** final checkpoint at step `13761`
- **Training length:** `1.00` epoch
- **Precision:** bfloat16
- **Format:** safetensors
## Dataset
- **Dataset:** `CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi`
- **Robot:** SO-101 follower
- **Episodes:** `3300`
- **Frames:** `3,522,774`
- **Tasks:** `800`
- **FPS:** `30`
- **Visual inputs:** `observation.images.top`, `observation.images.left_wrist`
- **State/action dimensions:** 6 DoF robot state/action, padded by the Pi0.5 policy configuration as needed
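The padding mentioned above can be pictured with a small sketch. It assumes the policy right-pads the 6-dimensional state and action vectors with zeros up to a fixed maximum width. The width of `32` used here is an illustrative assumption, not a value stated in this card; check `config.json` for the actual setting. `pad_vector` is a hypothetical helper, not a LeRobot function.

```python
def pad_vector(values, target_dim, pad_value=0.0):
    """Right-pad a 1-D sequence with pad_value until it has target_dim entries."""
    values = list(values)
    if len(values) > target_dim:
        raise ValueError(f"vector of length {len(values)} exceeds target_dim={target_dim}")
    return values + [pad_value] * (target_dim - len(values))


# Assumed maximum width for illustration only; read the real value from config.json.
ASSUMED_MAX_DIM = 32

# A made-up 6-DoF SO-101 joint state.
state = [0.1, -0.2, 0.3, 0.0, 0.5, 1.0]
padded_state = pad_vector(state, ASSUMED_MAX_DIM)

print(len(padded_state))
```

Under this assumption, only the first six entries of a predicted action carry meaning; the remaining entries are padding and are dropped before the action is sent to the robot.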
## Training Hyperparameters
| Setting | Value |
|---|---:|
| Steps | `13761` |
| Epochs | `1.00` |
| Per-device batch size | `16` |
| GPUs | `2` |
| Gradient accumulation | `8` |
| Effective batch size | `256` |
| Mixed precision | `bf16` |
| Policy dtype | `bfloat16` |
| Chunk size | `16` |
| Action steps | `16` |
| Gradient checkpointing | `true` |
| Compile model | `false` |
| DataLoader workers | `8` |
| DataLoader prefetch factor | `2` |
| Persistent workers | `true` |
| Pin memory | `true` |
| Preprocess in workers | `true` |
| DDP find unused parameters | `true` |
| Seed | `1000` |
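The batch-size and step figures in the table are consistent with the dataset size. The following sketch reproduces the arithmetic, assuming that one training step is one optimizer update over the full effective batch and that every frame counts as one sample:

```python
import math

per_device_batch = 16
num_gpus = 2
grad_accum_steps = 8
num_frames = 3_522_774  # frame count from the Dataset section

# Samples consumed per optimizer update across all devices.
effective_batch = per_device_batch * num_gpus * grad_accum_steps

# Updates needed to see every frame once (the last batch is partial).
steps_per_epoch = math.ceil(num_frames / effective_batch)

print(effective_batch, steps_per_epoch)  # 256 13761
```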
### Optimizer and Scheduler
| Setting | Value |
|---|---:|
| Optimizer | AdamW |
| Learning rate | `2.5e-5` |
| Betas | `[0.9, 0.95]` |
| Epsilon | `1e-8` |
| Weight decay | `0.01` |
| Gradient clip norm | `1.0` |
| Scheduler | cosine decay with warmup |
| Configured warmup steps | `1000` |
| Effective warmup steps | `458` |
| Configured decay steps | `30000` |
| Effective decay steps | `13761` |
| Final decay LR | `2.5e-6` |
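The scheduler was automatically scaled because `num_training_steps=13761` was smaller than the configured `num_decay_steps=30000`. The decay horizon was shortened to the training length, and the warmup was reduced by the same ratio (`1000 × 13761 / 30000 ≈ 458` steps).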
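The effective values in the table can be reproduced with a short sketch. This is an illustration of proportional scaling followed by a generic linear-warmup and cosine-decay curve, written from the numbers in this card. It is not copied from the LeRobot scheduler, whose exact interpolation may differ; `scale_schedule` and `lr_at` are hypothetical names.

```python
import math


def scale_schedule(warmup_steps, decay_steps, training_steps):
    """Shrink warmup and decay when training is shorter than the decay horizon."""
    if training_steps < decay_steps:
        factor = training_steps / decay_steps
        return int(warmup_steps * factor), training_steps
    return warmup_steps, decay_steps


def lr_at(step, peak_lr, final_lr, warmup_steps, decay_steps):
    """Linear warmup toward peak_lr, then cosine decay toward final_lr."""
    if step < warmup_steps:
        return peak_lr * (step + 1) / (warmup_steps + 1)
    progress = min(step, decay_steps) / decay_steps
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return final_lr + (peak_lr - final_lr) * cosine


warmup, decay = scale_schedule(1000, 30000, 13761)
print(warmup, decay)  # 458 13761

# Learning rate at the end of the decay horizon equals the configured final LR.
final_lr = lr_at(decay, peak_lr=2.5e-5, final_lr=2.5e-6,
                 warmup_steps=warmup, decay_steps=decay)
```

With these settings the curve ends at `2.5e-6`, which matches the `lr` value in the final training log.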
## Final Training Log Snapshot
The final logged training metrics near completion were:
- `step=13760/13761`
- `epoch=1.00`
- `loss=0.009`
- `grad_norm=0.259`
- `lr=2.5e-06`
- `updt_s=1.658`
- `data_s=0.017`
Training completed successfully on `2026-05-13 18:37:47 UTC`.
## Files
This repository includes only the inference/evaluation policy files from `pretrained_model`:
- `config.json`
- `model.safetensors`
- `train_config.json`
- `policy_preprocessor.json`
- `policy_preprocessor_step_2_normalizer_processor.safetensors`
- `policy_postprocessor.json`
- `policy_postprocessor_step_0_unnormalizer_processor.safetensors`
Optimizer state and other resumable training-state files are intentionally excluded.
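After downloading the repository, it can be useful to confirm that all seven files are present before loading the policy. The sketch below uses only the standard library; `missing_policy_files` is a hypothetical helper and the directory path is a placeholder for wherever the files were saved.

```python
from pathlib import Path

# File names taken from the list above.
EXPECTED_FILES = [
    "config.json",
    "model.safetensors",
    "train_config.json",
    "policy_preprocessor.json",
    "policy_preprocessor_step_2_normalizer_processor.safetensors",
    "policy_postprocessor.json",
    "policy_postprocessor_step_0_unnormalizer_processor.safetensors",
]


def missing_policy_files(model_dir):
    """Return the expected file names that are not present in model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # Placeholder path; replace with the local download directory.
    missing = missing_policy_files("./pretrained_model")
    print("all files present" if not missing else f"missing: {missing}")
```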
## Evaluation Status
No rollout or task-success evaluation metrics are included yet. This checkpoint is intended as a reproducible 1-epoch Pi0.5 fine-tuning artifact for IsaacLab SO-101 multi-task experiments.
## Reproducibility
Training was launched from the AutoDataCollector LeRobot workspace with the Pi0.5 IsaacLab training script, using a configuration equivalent to:
```bash
DATASET_REPO_ID=CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi \
POLICY_PATH=lerobot/pi05_base \
BATCH_SIZE=16 \
GRADIENT_ACCUMULATION_STEPS=8 \
NUM_GPUS=2 \
STEPS=13761 \
MIXED_PRECISION=bf16 \
POLICY_DTYPE=bfloat16 \
CHUNK_SIZE=16 \
N_ACTION_STEPS=16 \
GRADIENT_CHECKPOINTING=true \
FREEZE_VISION_ENCODER=false \
TRAIN_EXPERT_ONLY=false \
NUM_WORKERS=8 \
DATALOADER_PREFETCH_FACTOR=2 \
DATALOADER_PERSISTENT_WORKERS=true \
DATALOADER_PIN_MEMORY=true \
PREPROCESS_IN_WORKERS=true \
OPTIMIZER_LR=2.5e-5 \
OPTIMIZER_WEIGHT_DECAY=0.01 \
OPTIMIZER_GRAD_CLIP_NORM=1.0 \
SCHEDULER_WARMUP_STEPS=1000 \
SCHEDULER_DECAY_STEPS=30000 \
SCHEDULER_DECAY_LR=2.5e-6 \
./lerobot/scripts/train_pi05_isaaclab.sh
```