SmolVLA Fine-tuned on LIBERO-Spatial
This is a fine-tuned version of lerobot/smolvla_base trained on the LIBERO-Spatial benchmark using the LeRobot framework.
Demo Video
Task 8 success episode (70% success rate on this task):
Model Details
- Base model: lerobot/smolvla_base
- Parameters: 450M total (100M trainable action expert)
- Training steps: 20,000
- Batch size: 8
- Hardware: NVIDIA L4 24GB (Google Colab Pro)
- Training time: ~2.5 hours
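The figures listed above imply a rough training throughput. The following is a minimal sketch of that arithmetic. It assumes no gradient accumulation (so one step consumes exactly one batch) and treats the approximate 2.5 hour training time as exact; neither assumption is confirmed by this card.

```python
# Values copied from the Model Details list.
training_steps = 20_000
batch_size = 8
training_hours = 2.5  # reported as "~2.5 hours", so results are approximate

# Assumption: one optimizer step consumes one batch (no gradient accumulation).
samples_seen = training_steps * batch_size

# Average throughput over the whole run.
steps_per_second = training_steps / (training_hours * 3600)
samples_per_second = samples_seen / (training_hours * 3600)

print(f"Samples seen:       {samples_seen:,}")
print(f"Steps per second:   {steps_per_second:.2f}")
print(f"Samples per second: {samples_per_second:.2f}")
```

Under these assumptions the run processes 160,000 training samples at a little over two steps per second.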
Performance on LIBERO-Spatial
| Task | Success Rate |
|---|---|
| task_0 | 60% |
| task_1 | 50% |
| task_2 | 60% |
| task_3 | 10% |
| task_4 | 20% |
| task_5 | 20% |
| task_6 | 10% |
| task_7 | 30% |
| task_8 | 70% |
| task_9 | 30% |
| Overall | 36% |
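The overall figure is consistent with an unweighted mean of the ten per-task rates. A short sketch that reproduces it from the table values (the variable names are illustrative, not part of LeRobot):

```python
# Per-task success rates in percent, copied from the table above.
success_rates = {
    "task_0": 60,
    "task_1": 50,
    "task_2": 60,
    "task_3": 10,
    "task_4": 20,
    "task_5": 20,
    "task_6": 10,
    "task_7": 30,
    "task_8": 70,
    "task_9": 30,
}

# Unweighted mean across tasks.
overall = sum(success_rates.values()) / len(success_rates)

# Best task and the set of tasks tied for the lowest rate.
best_task = max(success_rates, key=success_rates.get)
lowest_rate = min(success_rates.values())
hardest_tasks = sorted(t for t, r in success_rates.items() if r == lowest_rate)

print(f"Overall: {overall:.0f}%")
print(f"Best task: {best_task} ({success_rates[best_task]}%)")
print(f"Hardest tasks: {hardest_tasks} ({lowest_rate}%)")
```

The spread is wide: task_8 reaches 70% while task_3 and task_6 sit at 10%, so the overall number hides substantial per-task variation.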
Training Command
lerobot-train \
--policy.type=smolvla \
--policy.pretrained_path=lerobot/smolvla_base \
--dataset.repo_id=HuggingFaceVLA/libero \
--batch_size=8 \
--steps=20000 \
--seed=42
Ablation Study — Training Duration
We evaluated checkpoints at multiple steps to understand convergence:
| Training Steps | Overall Success Rate |
|---|---|
| 2,000 | 2% |
| 6,000 | 17% |
| 10,000 | 31% |
| 20,000 | 36% |
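Success rate rises monotonically with training steps, but with diminishing returns: the gain from 10K to 20K steps (5 percentage points) is much smaller than the gain from 6K to 10K steps (14 percentage points). This suggests the model begins to converge around 10K steps on LIBERO-Spatial.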
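One way to quantify the diminishing returns is to compute the success-rate gain per 1,000 training steps between consecutive checkpoints. This is a small sketch using only the numbers from the ablation table; the variable names are illustrative.

```python
# (training steps, success rate in percent) from the ablation table.
checkpoints = [(2_000, 2), (6_000, 17), (10_000, 31), (20_000, 36)]

# Percentage points gained per 1,000 steps between consecutive checkpoints.
gains_per_1k = []
for (steps_a, rate_a), (steps_b, rate_b) in zip(checkpoints, checkpoints[1:]):
    gain = (rate_b - rate_a) / ((steps_b - steps_a) / 1000)
    gains_per_1k.append(gain)
    print(f"{steps_a:>6,} -> {steps_b:>6,} steps: {gain:.2f} points per 1K steps")
```

The marginal gain stays near 3.5 to 3.75 points per 1K steps up to 10K steps, then drops to 0.5 points per 1K steps over the final 10K.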