Repository: CoRL2026-CSI/Pi0.5-IsaacLab-Multi-Task-1epochs
Add model card
README.md
ADDED
@@ -0,0 +1,138 @@
---
language:
- en
library_name: lerobot
license: gemma
pipeline_tag: robotics
tags:
- vision-language-action
- imitation-learning
- behavior-cloning
- lerobot
- pi05
- pi0.5
- openpi
- robotics
- isaaclab
- so101
- multi-task
- corl2026
- bfloat16
- full-finetune
- safetensors
datasets:
- CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi
base_model:
- lerobot/pi05_base
inference: false
---

# Pi0.5 IsaacLab Multi-Task 1 Epoch

This repository contains a Pi0.5 policy fine-tuned with LeRobot on the IsaacLab SO-101 multi-task dataset `CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi`.

## Model Details

- **Base model:** `lerobot/pi05_base`
- **Policy type:** `pi05`
- **Training type:** full fine-tuning
- **Vision encoder frozen:** no
- **Action expert only:** no
- **Checkpoint:** final checkpoint at step `13761`
- **Training length:** `1.00` epoch
- **Precision:** bfloat16
- **Format:** safetensors

## Dataset

- **Dataset:** `CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi`
- **Robot:** SO-101 follower
- **Episodes:** `3300`
- **Frames:** `3,522,774`
- **Tasks:** `800`
- **FPS:** `30`
- **Visual inputs:** `observation.images.top`, `observation.images.left_wrist`
- **State/action dimensions:** 6 DoF robot state/action, padded by the Pi0.5 policy configuration as needed
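
The episode, frame, and FPS figures above imply an average episode length. A minimal sketch of that arithmetic, assuming every frame is one timestep recorded at the stated 30 FPS (the variable names are illustrative, not part of the dataset metadata):

```python
# Figures copied from the dataset summary above.
EPISODES = 3300
FRAMES = 3_522_774
FPS = 30

# Assumption: each frame is one control timestep at the stated FPS.
mean_frames = FRAMES / EPISODES
mean_seconds = mean_frames / FPS
total_hours = FRAMES / FPS / 3600

print(f"{mean_frames:.1f} frames per episode on average")
print(f"{mean_seconds:.1f} s per episode on average")
print(f"{total_hours:.1f} h of recorded data in total")
```

Under that assumption, an average episode runs a little over half a minute.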

## Training Hyperparameters

| Setting | Value |
|---|---:|
| Steps | `13761` |
| Epochs | `1.00` |
| Per-device batch size | `16` |
| GPUs | `2` |
| Gradient accumulation | `8` |
| Effective batch size | `256` |
| Mixed precision | `bf16` |
| Policy dtype | `bfloat16` |
| Chunk size | `16` |
| Action steps | `16` |
| Gradient checkpointing | `true` |
| Compile model | `false` |
| DataLoader workers | `8` |
| DataLoader prefetch factor | `2` |
| Persistent workers | `true` |
| Pin memory | `true` |
| Preprocess in workers | `true` |
| DDP find unused parameters | `true` |
| Seed | `1000` |
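
The effective batch size and step count in this table follow from the other entries and the dataset's frame count. A short check, assuming one epoch means one pass over all 3,522,774 frames with the last partial batch rounded up:

```python
import math

# Values copied from the hyperparameter table and the dataset summary.
PER_DEVICE_BATCH = 16
NUM_GPUS = 2
GRAD_ACCUM = 8
FRAMES = 3_522_774

# Samples consumed per optimizer step across devices and accumulation.
effective_batch = PER_DEVICE_BATCH * NUM_GPUS * GRAD_ACCUM

# Assumption: an epoch is one pass over every frame, rounding up the final batch.
steps_per_epoch = math.ceil(FRAMES / effective_batch)

print(effective_batch, steps_per_epoch)  # 256 13761
```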

### Optimizer and Scheduler

| Setting | Value |
|---|---:|
| Optimizer | AdamW |
| Learning rate | `2.5e-5` |
| Betas | `[0.9, 0.95]` |
| Epsilon | `1e-8` |
| Weight decay | `0.01` |
| Gradient clip norm | `1.0` |
| Scheduler | cosine decay with warmup |
| Configured warmup steps | `1000` |
| Effective warmup steps | `458` |
| Configured decay steps | `30000` |
| Effective decay steps | `13761` |
| Final decay LR | `2.5e-6` |

The scheduler was automatically scaled because `num_training_steps=13761` was smaller than the configured `num_decay_steps=30000`.
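
The effective values in the table are consistent with shrinking the warmup by the ratio of training steps to configured decay steps and truncating to an integer. The sketch below reproduces that scaling and pairs it with one common warmup-plus-cosine curve. The function names `scale_schedule` and `lr_at` are hypothetical, and the linear-warmup shape is an assumption; LeRobot's exact curve may differ in detail.

```python
import math

def scale_schedule(num_training_steps, warmup_steps, decay_steps):
    """Shrink warmup proportionally when the run is shorter than the decay horizon."""
    if num_training_steps < decay_steps:
        scale = num_training_steps / decay_steps
        return int(warmup_steps * scale), num_training_steps
    return warmup_steps, decay_steps

def lr_at(step, peak_lr, final_lr, warmup_steps, decay_steps):
    """Illustrative curve: linear warmup to peak_lr, then cosine decay to final_lr."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = min(1.0, (step - warmup_steps) / (decay_steps - warmup_steps))
    return final_lr + 0.5 * (peak_lr - final_lr) * (1.0 + math.cos(math.pi * progress))

warmup, decay = scale_schedule(13761, 1000, 30000)
print(warmup, decay)  # 458 13761
print(lr_at(decay, 2.5e-5, 2.5e-6, warmup, decay))
```

With these inputs the learning rate at the last step equals the configured final decay LR, which matches the `lr=2.5e-06` reported at the end of training.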

## Final Training Log Snapshot

The final logged training metrics near completion were:

- `step=13760/13761`
- `epoch=1.00`
- `loss=0.009`
- `grad_norm=0.259`
- `lr=2.5e-06`
- `updt_s=1.658`
- `data_s=0.017`

Training completed successfully on `2026-05-13 18:37:47 UTC`.
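
One way to read the two timing entries is as a check on whether data loading was a bottleneck. The sketch below assumes `updt_s` and `data_s` are the per-step update time and data-loading time in seconds; that interpretation is an assumption and is not stated in this card.

```python
# Timing entries copied from the log snapshot above.
log_entries = ["updt_s=1.658", "data_s=0.017"]

# Parse "key=value" strings into a dict of floats.
metrics = {key: float(value) for key, value in (e.split("=") for e in log_entries)}

# Assumption: both values are seconds spent per logged step.
data_share = metrics["data_s"] / (metrics["updt_s"] + metrics["data_s"])
print(f"data loading share of step time: {data_share:.1%}")
```

If the assumption holds, data loading accounts for roughly one percent of each step, so the run was compute-bound rather than input-bound.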

## Files

This repository includes only the inference/evaluation policy files from `pretrained_model`:

- `config.json`
- `model.safetensors`
- `train_config.json`
- `policy_preprocessor.json`
- `policy_preprocessor_step_2_normalizer_processor.safetensors`
- `policy_postprocessor.json`
- `policy_postprocessor_step_0_unnormalizer_processor.safetensors`

Optimizer state and other resumable training-state files are intentionally excluded.
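
After downloading the repository, it can be useful to confirm that a local copy contains every file listed above before pointing an evaluation script at it. A minimal sketch using only the standard library; the helper `missing_files` and the example path `./pretrained_model` are hypothetical and not part of LeRobot:

```python
from pathlib import Path

# File names copied from the list above.
EXPECTED_FILES = [
    "config.json",
    "model.safetensors",
    "train_config.json",
    "policy_preprocessor.json",
    "policy_preprocessor_step_2_normalizer_processor.safetensors",
    "policy_postprocessor.json",
    "policy_postprocessor_step_0_unnormalizer_processor.safetensors",
]

def missing_files(checkpoint_dir):
    """Return the expected file names that are absent from checkpoint_dir."""
    root = Path(checkpoint_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]

if __name__ == "__main__":
    # Hypothetical local path; replace with wherever the files were downloaded.
    missing = missing_files("./pretrained_model")
    print("missing:", missing if missing else "none")
```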

## Evaluation Status

No rollout or task-success evaluation metrics are included yet. This checkpoint is intended as a reproducible 1-epoch Pi0.5 fine-tuning artifact for IsaacLab SO-101 multi-task experiments.

## Reproducibility

Training was launched from the AutoDataCollector LeRobot workspace using the Pi0.5 IsaacLab training script configuration corresponding to:

```bash
DATASET_REPO_ID=CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi \
POLICY_PATH=lerobot/pi05_base \
BATCH_SIZE=16 \
GRADIENT_ACCUMULATION_STEPS=8 \
NUM_GPUS=2 \
STEPS=13761 \
MIXED_PRECISION=bf16 \
POLICY_DTYPE=bfloat16 \
CHUNK_SIZE=16 \
N_ACTION_STEPS=16 \
GRADIENT_CHECKPOINTING=true \
FREEZE_VISION_ENCODER=false \
TRAIN_EXPERT_ONLY=false \
NUM_WORKERS=8 \
DATALOADER_PREFETCH_FACTOR=2 \
DATALOADER_PERSISTENT_WORKERS=true \
DATALOADER_PIN_MEMORY=true \
PREPROCESS_IN_WORKERS=true \
OPTIMIZER_LR=2.5e-5 \
OPTIMIZER_WEIGHT_DECAY=0.01 \
OPTIMIZER_GRAD_CLIP_NORM=1.0 \
SCHEDULER_WARMUP_STEPS=1000 \
SCHEDULER_DECAY_STEPS=30000 \
SCHEDULER_DECAY_LR=2.5e-6 \
./lerobot/scripts/train_pi05_isaaclab.sh
```