SmolVLA Fine-tuned on LIBERO-Spatial

This is a fine-tuned version of lerobot/smolvla_base trained on the LIBERO-Spatial benchmark using the LeRobot framework.

Demo Video

Task 8 success episode (70% success rate on this task):

Model Details

Base model: lerobot/smolvla_base
Parameters: 450M total (100M trainable action expert)
Training steps: 20,000
Batch size: 8
Hardware: NVIDIA L4 24GB (Google Colab Pro)
Training time: ~2.5 hours
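
As a rough sanity check, the step count, batch size, and wall-clock time listed above imply an approximate training throughput. The sketch below is plain arithmetic on those values. It assumes a single device with no gradient accumulation, so one step consumes exactly one batch. Because the training time is given only as roughly 2.5 hours, the derived rates are estimates, and the variable names are illustrative.

```python
# Values copied from the model details above.
training_steps = 20_000
batch_size = 8
training_hours = 2.5  # approximate ("~2.5 hours"), so the rates below are estimates

# Total samples consumed, assuming one batch per step (no gradient accumulation).
samples_seen = training_steps * batch_size

# Approximate throughput averaged over the whole run.
training_seconds = training_hours * 3600
steps_per_second = training_steps / training_seconds
samples_per_second = samples_seen / training_seconds

print(f"Samples seen: {samples_seen:,}")
print(f"Throughput: ~{steps_per_second:.1f} steps/s, ~{samples_per_second:.1f} samples/s")
```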

Performance on LIBERO-Spatial

Task	Success Rate
task_0	60%
task_1	50%
task_2	60%
task_3	10%
task_4	20%
task_5	20%
task_6	10%
task_7	30%
task_8	70%
task_9	30%
Overall	36%
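
The overall figure is consistent with an unweighted mean of the ten per-task rates. A minimal sketch of that calculation follows, assuming every task was evaluated on the same number of episodes (the card does not state the episode count); the helper name `overall_success_rate` and the 20% threshold are illustrative.

```python
# Per-task success rates in percent, copied from the table above.
success_rates = {
    "task_0": 60, "task_1": 50, "task_2": 60, "task_3": 10, "task_4": 20,
    "task_5": 20, "task_6": 10, "task_7": 30, "task_8": 70, "task_9": 30,
}

def overall_success_rate(rates):
    """Unweighted mean of per-task success rates, in percent."""
    return sum(rates.values()) / len(rates)

overall = overall_success_rate(success_rates)

# Tasks at or below an illustrative 20% threshold, i.e. where the policy struggles most.
hard_tasks = [task for task, rate in success_rates.items() if rate <= 20]

print(f"Overall: {overall:.0f}%")
print("Hardest tasks:", ", ".join(hard_tasks))
```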

Training Command

lerobot-train \
  --policy.type=smolvla \
  --policy.pretrained_path=lerobot/smolvla_base \
  --dataset.repo_id=HuggingFaceVLA/libero \
  --batch_size=8 \
  --steps=20000 \
  --seed=42

Ablation Study — Training Duration

We evaluated checkpoints at four training steps to see how quickly the policy converges:

Training Steps	Overall Success Rate
2,000	2%
6,000	17%
10,000	31%
20,000	36%

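Success rate rises at every checkpoint, but with diminishing returns: it gains 15 points from 2K to 6K steps and 14 points from 6K to 10K, then only 5 points over the final 10K steps. This suggests convergence begins around 10K steps on LIBERO-Spatial.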
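
To make the diminishing returns concrete, the checkpoint results can be converted into a gain per 1,000 training steps between consecutive checkpoints. This is a small illustrative calculation on the table values, not part of the training or evaluation pipeline; the function name `gains_per_1k_steps` is hypothetical.

```python
# (training steps, overall success rate in percent), copied from the table above.
checkpoints = [(2_000, 2), (6_000, 17), (10_000, 31), (20_000, 36)]

def gains_per_1k_steps(points):
    """Success-rate gain (percentage points) per 1,000 steps between consecutive checkpoints."""
    gains = []
    for (start_step, start_rate), (end_step, end_rate) in zip(points, points[1:]):
        step_span_k = (end_step - start_step) / 1000
        gains.append((start_step, end_step, (end_rate - start_rate) / step_span_k))
    return gains

gains = gains_per_1k_steps(checkpoints)
for start_step, end_step, gain in gains:
    print(f"{start_step:>6,} -> {end_step:>6,}: {gain:.2f} points per 1K steps")
```

The per-1K gain stays near 3.5 to 3.75 points up to 10K steps and drops to 0.5 points afterwards, which matches the convergence reading above.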
