	---
	license: apache-2.0
	library_name: lerobot
	base_model: lerobot/smolvla_base
	pipeline_tag: robotics
	tags:
	- lerobot
	- smolvla
	- robotics
	- ur7e
	- imitation-learning
	- code-as-policies
	- CoRL2026
	datasets:
	- CoRL2026-CSI/UR7e-CaP-Stack_Block-100epi_10fps_state_tplus1_action
	---

	# smolVLA · UR7e · Stack_Block (50 epoch, tp1)

	A SmolVLA policy obtained by fine-tuning [lerobot/smolvla_base](https://huggingface.co/lerobot/smolvla_base) for 50 epochs on the [CoRL2026-CSI/UR7e-CaP-Stack_Block-100epi_10fps_state_tplus1_action](https://huggingface.co/datasets/CoRL2026-CSI/UR7e-CaP-Stack_Block-100epi_10fps_state_tplus1_action) dataset.

	## Model details

	- Base model: `lerobot/smolvla_base` (SmolVLM2-500M-Video-Instruct + action expert)
	- Robot: UR7e (7-DOF, including the gripper)
	- Cameras: `realsense_topview`, `realsense_wrist` (480×640 → 256×256 resize)
	- Action: 7D joint positions (6 joints + gripper)
	- State variant: `state_tplus1_action` (state at t+1, action at t)
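
The `state_tplus1_action` alignment can be illustrated with a toy episode. This is a minimal sketch, assuming the variant pairs the state recorded at step t+1 with the action recorded at step t and drops the last step, which has no successor state. The helper `pair_state_tplus1` and the toy values are hypothetical and are not taken from the dataset tooling.

```python
def pair_state_tplus1(states, actions):
    """Pair the state at step t+1 with the action at step t.

    Assumption: this mirrors the `state_tplus1_action` alignment; the final
    step is dropped because it has no successor state.
    """
    if len(states) != len(actions):
        raise ValueError("states and actions must have the same length")
    return [
        {"observation.state": states[t + 1], "action": actions[t]}
        for t in range(len(actions) - 1)
    ]


# Toy episode: 4 steps of 7-D vectors (6 joints + gripper), made-up values.
states = [[float(t)] * 7 for t in range(4)]
actions = [[10.0 + t] * 7 for t in range(4)]
samples = pair_state_tplus1(states, actions)
```

Under this assumption an episode of N steps yields N - 1 training samples.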

	## Training

	| Config | Value |
	|---|---|
	| Dataset | [CoRL2026-CSI/UR7e-CaP-Stack_Block-100epi_10fps_state_tplus1_action](https://huggingface.co/datasets/CoRL2026-CSI/UR7e-CaP-Stack_Block-100epi_10fps_state_tplus1_action) (69932 frames, 100 episodes) |
	| Steps | 13700 (= 50 epochs) |
	| Global batch | 256 (BATCH=64 × NUM_GPUS=4) |
	| Optimizer | AdamW (lerobot smolvla preset) |
	| Mixed precision | no (inference in bf16) |
	| Image augmentation | brightness / contrast / saturation / hue / sharpness / affine; up to 3 randomly selected transforms |
	| Hardware | 4× H100 80GB |

	Training script: `scripts/ur7e_tplus1/train_smolvla_stack_block.sh` (in the CoRL2026 lerobot fork).
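
The step count in the table above can be reproduced from the frame count and the global batch size. This is a small arithmetic sketch, assuming the final partial batch of each epoch counts as one step (rounding up); the variable names are illustrative and are not taken from the training script.

```python
import math

NUM_FRAMES = 69_932   # frames in the dataset
BATCH_PER_GPU = 64    # BATCH
NUM_GPUS = 4          # NUM_GPUS
EPOCHS = 50

global_batch = BATCH_PER_GPU * NUM_GPUS                 # 64 * 4
steps_per_epoch = math.ceil(NUM_FRAMES / global_batch)  # round up the partial batch
total_steps = steps_per_epoch * EPOCHS

print(global_batch, steps_per_epoch, total_steps)  # 256 274 13700
```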

	## Camera rename

	Mapping from the camera keys in the LeRobot dataset to the keys expected by the SmolVLA policy:

	| Dataset key | Policy key |
	|---|---|
	| `observation.images.realsense_wrist` | `observation.images.camera1` |
	| `observation.images.realsense_topview` | `observation.images.camera2` |
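
When feeding observations to the policy outside the training pipeline, the same mapping has to be applied to the observation dictionary. A minimal sketch in plain Python follows; the helper `rename_camera_keys` and the placeholder frame are hypothetical and are not part of the LeRobot API.

```python
# Dataset camera key -> policy camera key, as listed in the table above.
CAMERA_RENAME = {
    "observation.images.realsense_wrist": "observation.images.camera1",
    "observation.images.realsense_topview": "observation.images.camera2",
}


def rename_camera_keys(observation, rename_map=CAMERA_RENAME):
    """Return a copy of `observation` with dataset camera keys renamed.

    Keys that are not in `rename_map` (e.g. `observation.state`) are kept as-is.
    """
    return {rename_map.get(key, key): value for key, value in observation.items()}


# Placeholder frame: strings stand in for image tensors.
frame = {
    "observation.images.realsense_wrist": "wrist_image",
    "observation.images.realsense_topview": "top_image",
    "observation.state": [0.0] * 7,
}
renamed = rename_camera_keys(frame)
```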

	## Usage

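	```python
	from lerobot.policies.smolvla.modeling_smolvla import SmolVLAPolicy

	# Download the fine-tuned checkpoint from the Hugging Face Hub and load it.
	policy = SmolVLAPolicy.from_pretrained("CoRL2026-CSI/smolVLA_UR7e_Stack_Block_50epoch_tp1")
	policy.eval()  # switch to inference mode
	```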

	## Citation / Acknowledgement

	Built on top of [LeRobot](https://github.com/huggingface/lerobot) and the [SmolVLA](https://huggingface.co/lerobot/smolvla_base) checkpoint. Project: CoRL 2026 CSI submission.