smolVLA · IsaacLab SO101 Multi-Task (11 tasks, 8 epochs)

A SmolVLA policy obtained by fine-tuning lerobot/smolvla_base for 8 epochs on CoRL2026-CSI/Isaaclab-so101_11task_baseCaP_3300epi_10fps, an 11-task multi-task dataset recorded with the SO101 in IsaacLab simulation.

This checkpoint is a full model (model.safetensors), not a LoRA adapter. Load it and use it as-is.

Model details

  • Base model: lerobot/smolvla_base (SmolVLM2-500M-Video-Instruct VLM + action expert)
  • Robot: SO101 (6-DOF, including the gripper), in IsaacLab simulation
  • Cameras: top, left_wrist (480×640), renamed to the policy keys camera1 (left_wrist) / camera2 (top)
  • Inputs: observation.state[6] + 2 cameras + language instruction (task)
  • Output: action[6] (joint position)
  • Action chunking: chunk_size=50, n_action_steps=50
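
With chunk_size=50 and n_action_steps=50, the policy predicts 50 actions per forward pass and all 50 are executed before the next pass. If the controller runs at the dataset's 10 fps, that is one inference every 5 seconds of motion. The sketch below illustrates this queueing behaviour only. `ChunkedActionQueue` and `predict_chunk` are hypothetical names, and this is not LeRobot's implementation, which handles the queue internally.

```python
from collections import deque
from typing import Callable, Deque, List

CHUNK_SIZE = 50        # actions predicted per forward pass (from this card)
N_ACTION_STEPS = 50    # actions executed before re-planning (from this card)
ACTION_DIM = 6         # joint-position targets for the SO101

Action = List[float]


class ChunkedActionQueue:
    """Illustrative action queue: call the model only when the queue is empty."""

    def __init__(self, predict_chunk: Callable[[dict], List[Action]]):
        # `predict_chunk` is a stand-in for one policy forward pass.
        self._predict_chunk = predict_chunk
        self._queue: Deque[Action] = deque()
        self.num_inferences = 0

    def reset(self) -> None:
        """Drop leftover actions, e.g. at the start of a new episode."""
        self._queue.clear()

    def select_action(self, observation: dict) -> Action:
        if not self._queue:
            chunk = self._predict_chunk(observation)
            if len(chunk) != CHUNK_SIZE:
                raise ValueError(f"expected {CHUNK_SIZE} actions, got {len(chunk)}")
            # Keep only the first N_ACTION_STEPS actions of the chunk.
            self._queue.extend(chunk[:N_ACTION_STEPS])
            self.num_inferences += 1
        return self._queue.popleft()


def dummy_predict_chunk(observation: dict) -> List[Action]:
    """Placeholder model: returns CHUNK_SIZE zero actions."""
    return [[0.0] * ACTION_DIM for _ in range(CHUNK_SIZE)]


queue = ChunkedActionQueue(dummy_predict_chunk)
actions = [queue.select_action({}) for _ in range(120)]
```

Running 120 control steps through this queue triggers three forward passes (steps 1, 51 and 101), so the policy reacts to new observations only at chunk boundaries.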

Training method

VLM frozen, action expert only: the official standard training recipe for SmolVLA (SmolVLA paper, arXiv:2506.01844).

| Component | Status |
| --- | --- |
| VLM backbone (SmolVLM2) | ❄️ Fully frozen (freeze_vision_encoder=true) |
| Action expert | 🔥 Trained (train_expert_only=true) |
| PEFT / LoRA | Not used |

Training hyperparameters

| Item | Value |
| --- | --- |
| Dataset | Isaaclab-so101_11task_baseCaP_3300epi_10fps: 3,300 episodes / 1,175,352 frames / 11 tasks / 10 fps |
| Epochs / Steps | 8 epochs / 36,800 steps |
| Global batch size | 256 (micro batch 128 × 2 GPUs) |
| Optimizer | AdamW: lr 1e-4, weight_decay 1e-10, grad_clip_norm 10.0 |
| LR scheduler | cosine_decay_with_warmup: warmup 1,000 / decay 30,000 / peak_lr 1e-4 / decay_lr 2.5e-6 |
| chunk_size / n_action_steps | 50 / 50 |
| Seed | 1000 |
| Dataloader workers | 16 |
| Mixed precision | no (bf16 inference) |
| Image augmentation | ColorJitter (brightness/contrast/saturation/hue) + SharpnessJitter; no geometric transforms (rotation/translation/flip), to preserve left/right semantics for the VLA |
| Hardware | 2 × NVIDIA H100 80GB |
| Final loss | 0.020 |
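
The step count and the schedule can be cross-checked with a little arithmetic. The sketch below uses only the numbers listed above. The schedule function is a generic linear-warmup plus cosine-decay form written for illustration; the exact warmup shape and edge-case handling of LeRobot's cosine_decay_with_warmup are assumptions here and may differ.

```python
import math

# Values copied from the hyperparameter list above.
NUM_FRAMES = 1_175_352
GLOBAL_BATCH_SIZE = 128 * 2   # micro batch 128 on each of 2 GPUs
TOTAL_STEPS = 36_800
PEAK_LR = 1e-4
DECAY_LR = 2.5e-6
WARMUP_STEPS = 1_000
DECAY_STEPS = 30_000

# One optimizer step consumes one global batch of frames.
steps_per_epoch = NUM_FRAMES / GLOBAL_BATCH_SIZE
epochs_covered = TOTAL_STEPS / steps_per_epoch


def learning_rate(step: int) -> float:
    """Illustrative schedule: linear warmup, then cosine decay to a floor."""
    if step < WARMUP_STEPS:
        # Assumed warmup shape: linear ramp from near zero up to the peak.
        return PEAK_LR * (step + 1) / (WARMUP_STEPS + 1)
    # Cosine from PEAK_LR down to DECAY_LR, held at the floor after DECAY_STEPS.
    progress = min(step, DECAY_STEPS) / DECAY_STEPS
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return DECAY_LR + (PEAK_LR - DECAY_LR) * cosine


# Steps trained after the decay horizon has been reached.
steps_at_floor = TOTAL_STEPS - DECAY_STEPS

if __name__ == "__main__":
    print(f"steps per epoch: {steps_per_epoch:.1f}")
    print(f"epochs covered by {TOTAL_STEPS} steps: {epochs_covered:.2f}")
    print(f"steps at the floor learning rate: {steps_at_floor}")
```

Under these assumptions, 36,800 steps at a global batch of 256 cover slightly more than 8 passes over the 1,175,352 frames. Because the decay horizon (30,000) is shorter than the run, the final 6,800 steps would train at the floor learning rate of 2.5e-6.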

Camera rename

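Mapping between the camera keys in the LeRobot dataset and the SmolVLA policy keys:

| Dataset key | Policy key |
| --- | --- |
| observation.images.left_wrist | observation.images.camera1 |
| observation.images.top | observation.images.camera2 |

The same rename must be applied at inference and evaluation time so that training and inference stay consistent.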
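
The mapping can be applied as a plain key rename before an observation reaches the policy. This is a minimal sketch, assuming observations arrive as a flat dict keyed by the dataset names. `rename_camera_keys` is a hypothetical helper, not a LeRobot function, and the task string and frame values are placeholders.

```python
# Dataset key -> policy key, copied from the camera mapping in this card.
CAMERA_RENAME_MAP = {
    "observation.images.left_wrist": "observation.images.camera1",
    "observation.images.top": "observation.images.camera2",
}


def rename_camera_keys(observation: dict) -> dict:
    """Return a copy of `observation` with dataset camera keys renamed."""
    missing = [key for key in CAMERA_RENAME_MAP if key not in observation]
    if missing:
        # Fail loudly rather than silently feeding the policy fewer cameras.
        raise KeyError(f"observation is missing camera keys: {missing}")
    return {CAMERA_RENAME_MAP.get(key, key): value for key, value in observation.items()}


# Placeholder observation; real frames would be image tensors, not strings.
example_observation = {
    "observation.state": [0.0] * 6,
    "observation.images.left_wrist": "wrist_frame",
    "observation.images.top": "top_frame",
    "task": "placeholder instruction",
}
renamed = rename_camera_keys(example_observation)
```

Raising on a missing camera is a deliberate choice in this sketch: a mismatch between training and inference keys is easier to debug as an error than as degraded behaviour.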

Input / Output specification

  • Input: only observation.state[6] (joint position) + 2 cameras + language instruction (task)
  • Output: only action[6] (joint position)
  • The dataset fields ee_pos / gripper_binary / state.radian_urdf0 / action.radian_urdf0 are excluded from training
  • The SmolVLA policy has a fixed set of three camera slots (camera1/2/3), so a camera3 slot exists in the config. The dataset has only two cameras, so data actually flows through only two of them.
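
One way to enforce this specification is an allow-list applied to each dataset frame, sketched below. `select_policy_inputs` is a hypothetical helper, and the extra keys in the example frame are illustrative spellings only, since the full dataset key names for the excluded fields are not given here.

```python
STATE_DIM = 6  # joint positions, per the specification above

# Only these dataset keys are passed on to the policy.
POLICY_INPUT_KEYS = (
    "observation.state",
    "observation.images.left_wrist",
    "observation.images.top",
    "task",
)


def select_policy_inputs(frame: dict) -> dict:
    """Keep only the allowed inputs and check the state dimension."""
    missing = [key for key in POLICY_INPUT_KEYS if key not in frame]
    if missing:
        raise KeyError(f"frame is missing required keys: {missing}")
    state = frame["observation.state"]
    if len(state) != STATE_DIM:
        raise ValueError(f"expected a {STATE_DIM}-dim state, got {len(state)}")
    # Everything not on the allow-list (e.g. end-effector fields) is dropped.
    return {key: frame[key] for key in POLICY_INPUT_KEYS}


# Placeholder frame; the two extra keys are hypothetical names for excluded fields.
example_frame = {
    "observation.state": [0.0] * 6,
    "observation.images.left_wrist": "wrist_frame",
    "observation.images.top": "top_frame",
    "task": "placeholder instruction",
    "observation.ee_pos": [0.0, 0.0, 0.0],
    "observation.gripper_binary": 0,
}
policy_inputs = select_policy_inputs(example_frame)
```

An allow-list is used instead of a deny-list so that any field added to the dataset later is excluded by default.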

Usage

from lerobot.policies.smolvla.modeling_smolvla import SmolVLAPolicy

policy = SmolVLAPolicy.from_pretrained("CoRL2026-CSI/smolVLA-IsaacLab-Multi-Task-8epoch-mod")

Citation / Acknowledgement

Built on top of LeRobot and the SmolVLA base checkpoint. Project: CoRL 2026 CSI submission.

Framework versions

  • LeRobot 0.5.2