license: apache-2.0
library_name: lerobot
base_model: lerobot/smolvla_base
pipeline_tag: robotics
tags:
  - lerobot
  - smolvla
  - robotics
  - ur7e
  - imitation-learning
  - code-as-policies
  - CoRL2026
datasets:
  - CoRL2026-CSI/UR7e-CaP-Stack_Block-100epi_10fps_state_tplus1_action

smolVLA · UR7e · Stack_Block (50 epoch, tp1)

A SmolVLA policy model obtained by fine-tuning lerobot/smolvla_base for 50 epochs on the CoRL2026-CSI/UR7e-CaP-Stack_Block-100epi_10fps_state_tplus1_action dataset.

Model details

Base model: lerobot/smolvla_base (SmolVLM2-500M-Video-Instruct + action expert)
Robot: UR7e (7-DOF, including the gripper)
Cameras: realsense_topview, realsense_wrist (480×640 → 256×256 resize)
Action: 7D joint positions (6 joints + gripper)
State variant: state_tplus1_action (state at t+1, action at t)
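
The state variant above can be illustrated with a small index-shifting sketch. This is one possible reading of "state at t+1, action at t", not the dataset's actual conversion code: the helper name `pair_state_tplus1_with_action` is hypothetical, and dropping the final step (which has no t+1 state) is an assumption.

```python
def pair_state_tplus1_with_action(states, actions):
    """Pair the state recorded at step t+1 with the action recorded at step t.

    Assumed reading of "state_tplus1_action". The last step has no t+1 state,
    so it is dropped here (an assumption, not confirmed by the dataset card).
    """
    if len(states) != len(actions):
        raise ValueError("states and actions must have the same length")
    return [(states[t + 1], actions[t]) for t in range(len(actions) - 1)]


# Toy episode with 4 steps; strings stand in for 7D state and action vectors.
states = ["s0", "s1", "s2", "s3"]
actions = ["a0", "a1", "a2", "a3"]
pairs = pair_state_tplus1_with_action(states, actions)
print(pairs)
```

With this reading, an episode of N recorded steps yields N - 1 training pairs.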

Training

Config	Value
Dataset	CoRL2026-CSI/UR7e-CaP-Stack_Block-100epi_10fps_state_tplus1_action (69932 frames, 100 episodes)
Steps	13700 (= 50 epochs)
Global batch	256 (BATCH=64 × NUM_GPUS=4)
Optimizer	AdamW (lerobot smolvla preset)
Mixed precision	no (bf16 used at inference)
Image augmentation	brightness / contrast / saturation / hue / sharpness / affine; at most 3 transforms applied, chosen at random
Hardware	4× H100 80GB
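
The step count in the table can be checked from the frame count and the global batch size. The sketch below assumes the last partial batch of each epoch counts as a full step (rounding up); the training script may compute this differently.

```python
import math

NUM_FRAMES = 69_932   # frames in the dataset (from the table above)
PER_GPU_BATCH = 64    # BATCH
NUM_GPUS = 4          # NUM_GPUS
EPOCHS = 50

global_batch = PER_GPU_BATCH * NUM_GPUS
# Assumption: a partial final batch still counts as one optimizer step.
steps_per_epoch = math.ceil(NUM_FRAMES / global_batch)
total_steps = steps_per_epoch * EPOCHS

print(global_batch, steps_per_epoch, total_steps)  # 256 274 13700
```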

Training script: scripts/ur7e_tplus1/train_smolvla_stack_block.sh (CoRL2026 lerobot fork).

Camera rename

Mapping between the camera keys in the LeRobot dataset and the keys expected by the SmolVLA policy:

Dataset key	Policy key
`observation.images.realsense_wrist`	`observation.images.camera1`
`observation.images.realsense_topview`	`observation.images.camera2`
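
When feeding raw dataset frames or live robot observations to the policy, the mapping above could be applied with a plain dictionary rename. This is a minimal sketch: `CAMERA_RENAME` and `rename_camera_keys` are hypothetical names introduced here, not part of LeRobot, and the placeholder strings stand in for image tensors.

```python
# Dataset-to-policy camera key mapping, copied from the table above.
CAMERA_RENAME = {
    "observation.images.realsense_wrist": "observation.images.camera1",
    "observation.images.realsense_topview": "observation.images.camera2",
}


def rename_camera_keys(observation: dict) -> dict:
    """Return a copy of `observation` with dataset camera keys renamed to policy keys.

    Keys not listed in CAMERA_RENAME (e.g. "observation.state") pass through unchanged.
    """
    return {CAMERA_RENAME.get(key, key): value for key, value in observation.items()}


# Placeholder values stand in for image tensors and the 7D state vector.
frame = {
    "observation.images.realsense_wrist": "<wrist image>",
    "observation.images.realsense_topview": "<topview image>",
    "observation.state": "<7D state>",
}
renamed = rename_camera_keys(frame)
print(sorted(renamed))
```

The helper returns a new dictionary, so the original frame is left untouched.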

Usage

from lerobot.policies.smolvla.modeling_smolvla import SmolVLAPolicy

policy = SmolVLAPolicy.from_pretrained("CoRL2026-CSI/smolVLA_UR7e_Stack_Block_50epoch_tp1")

Citation / Acknowledgement

Built on top of LeRobot and the SmolVLA checkpoint. Project: CoRL 2026 CSI submission.