SmolVLA-CaP-SortBlock-50epochs

This repository contains a SmolVLA policy fine-tuned with LeRobot for the SO101 CAP task "Sort RGB Blocks to Matching Plates". The policy was initialized from CoRL2026-CSI/smolvla_isaaclab_so101_11task_basecap_3300epi_8ep and trained for 50 epochs (17,100 steps) on CoRL2026-CSI/SO101-cap_sort_RGBblock_to_matchingplate_10fps.

Model Details

| Field | Value |
| --- | --- |
| Policy type | smolvla |
| Task | sort red, green, and blue blocks onto plates of the matching color |
| Robot | SO101 follower |
| Dataset | CoRL2026-CSI/SO101-cap_sort_RGBblock_to_matchingplate_10fps |
| Base model | CoRL2026-CSI/smolvla_isaaclab_so101_11task_basecap_3300epi_8ep |
| Training steps | 17100 |
| Completed step | 17100 |
| Batch size | 128 per GPU |
| Effective batch size | 256 |
| Action chunk size | 50 |
| Action horizon | 50 |
| Observation steps | 1 |
| Inference denoising steps | 50 |
| Model weights | model.safetensors (864.7 MiB) |
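
As a rough illustration of what the chunk settings above mean at run time: with an action chunk size and action horizon of 50, and a 10 fps dataset (per the dataset name), one inference call would cover about 5 seconds of motion if the robot is also controlled at 10 Hz. The sketch below uses a hypothetical stand-in for the policy call, not the LeRobot API, and assumes every queued action is executed before the policy is queried again.

```python
from collections import deque

CHUNK_SIZE = 50      # actions predicted per inference call (from the table)
ACTION_HORIZON = 50  # actions executed before re-planning (from the table)
CONTROL_HZ = 10      # assumption: control rate matches the 10 fps dataset


def seconds_per_chunk(horizon=ACTION_HORIZON, hz=CONTROL_HZ):
    """Wall-clock time covered by one executed chunk."""
    return horizon / hz


def count_inference_calls(num_control_steps, horizon=ACTION_HORIZON):
    """Simulate a queue-based controller and count policy queries."""
    queue = deque()
    calls = 0
    for _ in range(num_control_steps):
        if not queue:
            # Hypothetical policy call: yields CHUNK_SIZE placeholder actions,
            # of which only the first `horizon` are queued for execution.
            chunk = list(range(CHUNK_SIZE))
            queue.extend(chunk[:horizon])
            calls += 1
        queue.popleft()  # execute one action per control step
    return calls


if __name__ == "__main__":
    print(seconds_per_chunk())         # seconds of motion per chunk
    print(count_inference_calls(120))  # policy queries in a 12 s episode
```

Under these assumptions the policy runs open-loop for the full chunk, so a shorter horizon would trade more inference calls for faster reaction to new observations.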

Training Setup

The run used two CUDA processes with batch_size=128 per process, image augmentation enabled, and camera key remapping from the dataset's raw cameras to the SmolVLA camera names:

observation.images.left_wrist -> observation.images.camera1
observation.images.top        -> observation.images.camera2
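
The mapping above can be sketched as a plain dictionary rename. This is a minimal illustration, not LeRobot's own remapping mechanism: `remap_camera_keys` is a hypothetical helper, and the only names taken from this card are the four observation keys.

```python
# Camera key mapping as listed in this card.
CAMERA_KEY_MAP = {
    "observation.images.left_wrist": "observation.images.camera1",
    "observation.images.top": "observation.images.camera2",
}


def remap_camera_keys(observation, key_map=CAMERA_KEY_MAP):
    """Return a copy of `observation` with raw camera keys renamed.

    Hypothetical helper: keys not listed in `key_map` pass through unchanged.
    """
    missing = [key for key in key_map if key not in observation]
    if missing:
        raise KeyError(f"observation is missing camera keys: {missing}")
    return {key_map.get(key, key): value for key, value in observation.items()}
```

Failing loudly on a missing camera key is a deliberate choice here: a silently absent view would otherwise only show up as degraded policy behavior.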

The checkpoint was saved at step 17100, the final training step. LeRobot's preprocessor and postprocessor artifacts are included in this repository alongside the weights.
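
The training numbers in this card can be cross-checked with a few lines of arithmetic. The sketch assumes no gradient accumulation and that every epoch takes the same number of optimizer steps; the per-epoch sample count is derived from those assumptions and is not a dataset size stated in this card.

```python
NUM_PROCESSES = 2        # CUDA processes (from this card)
BATCH_PER_PROCESS = 128  # batch_size per process (from this card)
TOTAL_STEPS = 17_100     # training steps (from this card)
EPOCHS = 50              # epochs (from this card)

# Effective batch size: samples consumed per optimizer step across processes.
effective_batch = NUM_PROCESSES * BATCH_PER_PROCESS

# Assumption: each epoch takes the same whole number of steps.
steps_per_epoch = TOTAL_STEPS // EPOCHS

# Derived estimates, not reported values.
samples_per_epoch = steps_per_epoch * effective_batch
total_samples = TOTAL_STEPS * effective_batch

print(f"effective batch size: {effective_batch}")
print(f"steps per epoch:      {steps_per_epoch}")
print(f"samples per epoch:    {samples_per_epoch}")
print(f"total samples seen:   {total_samples}")
```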

Files

model.safetensors
config.json
train_config.json
policy_preprocessor.json
policy_preprocessor_step_5_normalizer_processor.safetensors
policy_postprocessor.json
policy_postprocessor_step_0_unnormalizer_processor.safetensors
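
Before loading a local copy of this repository, it may help to confirm that all of the files listed above are present. The helper below is a hypothetical convenience function, not part of LeRobot, and the directory passed to it is whatever local path the repository was downloaded to.

```python
from pathlib import Path

# File names as listed in this card.
EXPECTED_FILES = [
    "model.safetensors",
    "config.json",
    "train_config.json",
    "policy_preprocessor.json",
    "policy_preprocessor_step_5_normalizer_processor.safetensors",
    "policy_postprocessor.json",
    "policy_postprocessor_step_0_unnormalizer_processor.safetensors",
]


def missing_files(checkpoint_dir):
    """Return the expected file names that are absent from `checkpoint_dir`."""
    root = Path(checkpoint_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

An empty list means the local copy contains the weights, both configs, and the normalization statistics used by the preprocessor and postprocessor.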

Usage

from lerobot.policies.smolvla.modeling_smolvla import SmolVLAPolicy

policy = SmolVLAPolicy.from_pretrained("CoRL2026-CSI/SmolVLA-CaP-SortBlock-50epochs")

For robot deployment, use the same camera mapping, normalization pipeline, and SO101 action/state conventions used by the training dataset.

Intended Use

This model is intended for imitation-learning experiments and SO101 tabletop manipulation research on the specified CAP task. It is not a general-purpose robot policy and should be validated in a controlled workspace before any hardware deployment.

Limitations

The model was trained on a single task dataset with fixed camera views, object set, action space, and workspace assumptions. No official evaluation success rate is included in this repository.
