How to use ShubhamK32/smolvla_so101_declutter with LeRobot:
```shell
# See https://github.com/huggingface/lerobot?tab=readme-ov-file#installation for more details
git clone https://github.com/huggingface/lerobot.git
cd lerobot
pip install -e ".[smolvla]"
```
```shell
# Launch finetuning on your dataset
python lerobot/scripts/train.py \
  --policy.path=ShubhamK32/smolvla_so101_declutter \
  --dataset.repo_id=lerobot/svla_so101_pickplace \
  --batch_size=64 \
  --steps=20000 \
  --output_dir=outputs/train/my_smolvla \
  --job_name=my_smolvla_training \
  --policy.device=cuda \
  --wandb.enable=true
```
```shell
# Run the policy using the record function.
# Adjust these for your setup:
#   --robot.port          <- use your port
#   --robot.id            <- use your robot id
#   --robot.cameras       <- use your cameras
#   --dataset.single_task <- use the same task description you used when recording your dataset
#   --dataset.repo_id     <- the dataset name on the HF Hub
python -m lerobot.record \
  --robot.type=so101_follower \
  --robot.port=/dev/ttyACM0 \
  --robot.id=my_blue_follower_arm \
  --robot.cameras="{ front: {type: opencv, index_or_path: 8, width: 640, height: 480, fps: 30}}" \
  --dataset.single_task="Grasp a lego block and put it in the bin." \
  --dataset.repo_id=HF_USER/dataset_name \
  --dataset.episode_time_s=50 \
  --dataset.num_episodes=10 \
  --policy.path=ShubhamK32/smolvla_so101_declutter
```

SmolVLA policy fine-tuned on the SO-101 Space Decluttering Dataset v1 for language-conditioned pick-and-place decluttering tasks on a 6-DoF SO-101 robotic arm. Trained using LeRobot.
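The record command above shows a single `front` camera, but this model was trained with two views (a fixed topview and a wrist-mounted wristview, listed below). A two-camera `--robot.cameras` value might look like the following sketch; the device indices are placeholders you must replace with your own cameras:

```shell
--robot.cameras="{ topview: {type: opencv, index_or_path: 0, width: 640, height: 480, fps: 30}, wristview: {type: opencv, index_or_path: 1, width: 640, height: 480, fps: 30}}"
```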
Trained on ShubhamK32/so101_declutter_v1, a multi-view teleoperation dataset with spatial distractors injected to prevent visual shortcut learning.
To load the policy directly in Python:

```python
from lerobot.policies.smolvla.modeling_smolvla import SmolVLAPolicy

policy = SmolVLAPolicy.from_pretrained("ShubhamK32/smolvla_so101_declutter")
```
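Once loaded, the policy selects actions from an observation batch. A minimal sketch of building such a batch, assuming the feature names match this dataset's topview/wristview cameras and a 6-DoF joint state (the key names, tensor shapes, and the `select_action` call should be verified against your LeRobot version):

```python
import torch

# Hypothetical observation batch; key names must match the features the
# policy was trained on (assumed here from the camera views listed below).
batch = {
    "observation.images.topview": torch.zeros(1, 3, 480, 640),    # overhead camera, CHW float image
    "observation.images.wristview": torch.zeros(1, 3, 480, 640),  # wrist-mounted camera
    "observation.state": torch.zeros(1, 6),                       # 6-DoF joint positions
    "task": ["Grasp a lego block and put it in the bin."],        # language instruction
}

# With the policy loaded as above (assumed API):
# with torch.no_grad():
#     action = policy.select_action(batch)
```

Images are batched channel-first float tensors; the `task` string conditions the language-guided policy, so it should match the phrasing used during data collection.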
Camera views:
- observation.images.topview: fixed overhead view; better for unoccluded pick-and-place tasks.
- observation.images.wristview: egocentric wrist-mounted view; better for overlapping and cluttered scenes.

Base model
HuggingFaceTB/SmolLM2-1.7B