---
license: apache-2.0
library_name: lerobot
pipeline_tag: robotics
base_model: CoRL2026-CSI/smolvla_isaaclab_so101_11task_basecap_3300epi_8ep
datasets:
- CoRL2026-CSI/SO101-cap_stack_RGBblock_on_bluedish_10fps
tags:
- lerobot
- robotics
- smolvla
- vla
- so101
- code-as-policies
- cap
- imitation-learning
- 50epochs
- single-arm
- dual-camera
- stack-block
- rgb-blocks
- blue-dish
---

# SmolVLA-CaP-StackBlock-50epochs

This repository contains a SmolVLA policy fine-tuned with LeRobot for the SO101 CAP task **Stack RGB Blocks on a Blue Dish**.

The policy was initialized from `CoRL2026-CSI/smolvla_isaaclab_so101_11task_basecap_3300epi_8ep` and trained for 50 epochs on `CoRL2026-CSI/SO101-cap_stack_RGBblock_on_bluedish_10fps`.

## Model Details

| Field | Value |
|---|---|
| Policy type | `smolvla` |
| Task | stack red, green, and blue blocks on the blue dish from bottom to top |
| Robot | SO101 follower |
| Dataset | `CoRL2026-CSI/SO101-cap_stack_RGBblock_on_bluedish_10fps` |
| Base model | `CoRL2026-CSI/smolvla_isaaclab_so101_11task_basecap_3300epi_8ep` |
| Training steps | `17100` |
| Completed step | `17100` |
| Batch size | `128` per GPU |
| Effective batch size | `256` |
| Action chunk size | `50` |
| Action horizon | `50` |
| Observation steps | `1` |
| Inference denoising steps | `50` |
| Model weights | `model.safetensors` (864.7 MiB) |

## Training Setup

The run used two CUDA processes with `batch_size=128` per process, image augmentation enabled, and camera key remapping from the dataset's raw cameras to the SmolVLA camera names:

```text
observation.images.left_wrist -> observation.images.camera1
observation.images.top -> observation.images.camera2
```

The checkpoint was saved locally at step `17100` with LeRobot's preprocessor and postprocessor artifacts included in this repository.

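The camera remapping and the batch-size arithmetic above can be sketched in plain Python. This is a minimal illustration, not the code used for training: the helper names `remap_camera_keys` and `effective_batch_size` are hypothetical, and the observation values are placeholders standing in for real image tensors and robot state.

```python
# Key mapping copied from the Training Setup section of this card.
CAMERA_KEY_MAP = {
    "observation.images.left_wrist": "observation.images.camera1",
    "observation.images.top": "observation.images.camera2",
}


def remap_camera_keys(observation: dict, key_map: dict = CAMERA_KEY_MAP) -> dict:
    """Return a copy of `observation` with dataset camera keys renamed.

    Hypothetical helper: keys not present in `key_map` pass through unchanged.
    """
    return {key_map.get(key, key): value for key, value in observation.items()}


def effective_batch_size(per_process_batch: int, num_processes: int) -> int:
    """Samples consumed per optimizer step across all processes."""
    return per_process_batch * num_processes


# Placeholder observation; real entries would be image tensors and a state vector.
raw_observation = {
    "observation.images.left_wrist": "wrist_frame_placeholder",
    "observation.images.top": "top_frame_placeholder",
    "observation.state": "state_placeholder",
}

remapped = remap_camera_keys(raw_observation)
print(sorted(remapped))
print(effective_batch_size(per_process_batch=128, num_processes=2))
```

At deployment time, frames from the wrist and top cameras would need to reach the policy under `camera1` and `camera2` respectively, matching the mapping used during training.
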
## Files

```text
model.safetensors
config.json
train_config.json
policy_preprocessor.json
policy_preprocessor_step_5_normalizer_processor.safetensors
policy_postprocessor.json
policy_postprocessor_step_0_unnormalizer_processor.safetensors
```

## Usage

```python
from lerobot.policies.smolvla.modeling_smolvla import SmolVLAPolicy

policy = SmolVLAPolicy.from_pretrained("CoRL2026-CSI/SmolVLA-CaP-StackBlock-50epochs")
```

For robot deployment, use the same camera mapping, normalization pipeline, and SO101 action/state conventions used by the training dataset.

## Intended Use

This model is intended for imitation-learning experiments and SO101 tabletop manipulation research on the specified CAP task. It is not a general-purpose robot policy and should be validated in a controlled workspace before any hardware deployment.

## Limitations

The model was trained on a single task dataset with fixed camera views, object set, action space, and workspace assumptions. No official evaluation success rate is included in this repository.
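
Since no success rate is reported here, anyone validating the policy will need to tally their own rollouts. The sketch below shows one way to summarize per-episode outcomes as a success rate with a Wilson score interval. The function name `success_rate_with_interval` is hypothetical, and the example outcomes are made-up placeholders, not results from this model.

```python
import math


def success_rate_with_interval(outcomes, z=1.96):
    """Return (rate, lower, upper) for a list of boolean episode outcomes.

    Uses the Wilson score interval; z=1.96 corresponds to roughly 95% coverage.
    """
    n = len(outcomes)
    if n == 0:
        raise ValueError("need at least one evaluation episode")
    p = sum(bool(o) for o in outcomes) / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the extremes.
    return p, max(0.0, center - half_width), min(1.0, center + half_width)


# Placeholder outcomes for illustration only (True = all three blocks stacked).
example_outcomes = [True, False, True, True]
rate, lower, upper = success_rate_with_interval(example_outcomes)
print(f"success rate {rate:.2f}, interval [{lower:.2f}, {upper:.2f}]")
```

With only a handful of episodes the interval stays wide, so a small evaluation should be read as a rough indication rather than a benchmark number.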