X-VLA SO-101 Phase II - All Checkpoints
Fine-tuned X-VLA model checkpoints for a pick-and-place task on the SO-101 robot arm.
Model Details
- Base model: lerobot/xvla-base
- Training steps: 200,000 total
- Task: Pick up cube and place in bin
- Robot: SO-101 single arm
- Action space: Delta position control (4D: x, y, z, gripper)
- Domain ID: 0 (WidowX-compatible)
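As a rough illustration of the 4D delta action layout listed above, the sketch below splits an action into its position delta and gripper command and applies the delta to a current end-effector position. The function name `apply_delta_action`, the 0.5 gripper threshold, and the reading of the gripper value as a "close" probability in [0, 1] are assumptions made for illustration; they are not taken from the checkpoint's configuration.

```python
from typing import Sequence, Tuple


def apply_delta_action(
    position: Sequence[float],
    action: Sequence[float],
    gripper_threshold: float = 0.5,  # assumed threshold, not from the model config
) -> Tuple[Tuple[float, float, float], bool]:
    """Apply a 4D delta action (dx, dy, dz, gripper) to an XYZ position."""
    if len(position) != 3 or len(action) != 4:
        raise ValueError("expected a 3D position and a 4D action")
    dx, dy, dz, grip = action
    # Delta position control: the action is an offset from the current position.
    new_position = (position[0] + dx, position[1] + dy, position[2] + dz)
    # Assumption: gripper values above the threshold mean "close".
    close_gripper = grip > gripper_threshold
    return new_position, close_gripper
```

On a real robot the resulting position would still need to be checked against workspace limits before being sent to the arm.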
Available Checkpoints
| Checkpoint | Steps | Path |
|---|---|---|
| 020000 | 20,000 | 020000/pretrained_model/ |
| 040000 | 40,000 | 040000/pretrained_model/ |
| 060000 | 60,000 | 060000/pretrained_model/ |
| 080000 | 80,000 | 080000/pretrained_model/ |
| 100000 | 100,000 | 100000/pretrained_model/ |
| 120000 | 120,000 | 120000/pretrained_model/ |
| 140000 | 140,000 | 140000/pretrained_model/ |
| 160000 | 160,000 | 160000/pretrained_model/ |
| 180000 | 180,000 | 180000/pretrained_model/ |
| 200000 | 200,000 | 200000/pretrained_model/ |
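The checkpoint names in the table follow a regular pattern (zero-padded step count, saved every 20,000 steps), so the subfolder for any checkpoint can be built programmatically. The helper below is a small sketch; `CHECKPOINT_STEPS` and `checkpoint_subfolder` are hypothetical names introduced here, not part of LeRobot.

```python
# Steps listed in the table above: 20,000 to 200,000 in increments of 20,000.
CHECKPOINT_STEPS = list(range(20_000, 200_001, 20_000))


def checkpoint_subfolder(step: int) -> str:
    """Return the repository subfolder for a given training step."""
    if step not in CHECKPOINT_STEPS:
        raise ValueError(f"no checkpoint saved at step {step}")
    # Directory names are the step count zero-padded to six digits.
    return f"{step:06d}/pretrained_model"
```

For example, `checkpoint_subfolder(20_000)` yields `020000/pretrained_model`, which can be passed as the `subfolder` argument when loading.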
Training Configuration
- Frozen: Vision encoder, Language encoder
- Trained: Policy transformer, Soft prompts, Action heads
- Loss: L1 for XYZ, BCE for gripper
- LR: 1e-4 → 1e-5 with warmup
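To make the loss and learning-rate entries above concrete, here is a dependency-free sketch for a single 4D action. Several details are assumptions, since the configuration above does not state them: equal weighting of the two loss terms, a gripper prediction expressed as a probability (the actual training code may operate on logits), a 1,000-step linear warmup, and a cosine decay shape. The function names are hypothetical.

```python
import math
from typing import Sequence


def action_loss(pred: Sequence[float], target: Sequence[float], eps: float = 1e-7) -> float:
    """L1 on (x, y, z) plus binary cross-entropy on the gripper channel."""
    # Mean absolute error over the three position deltas.
    l1 = sum(abs(p - t) for p, t in zip(pred[:3], target[:3])) / 3.0
    # Clamp the predicted probability to avoid log(0).
    p = min(max(pred[3], eps), 1.0 - eps)
    y = target[3]
    bce = -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))
    return l1 + bce  # assumed equal weighting


def learning_rate(
    step: int,
    total_steps: int = 200_000,
    warmup_steps: int = 1_000,  # assumed; the warmup length is not stated
    peak_lr: float = 1e-4,
    final_lr: float = 1e-5,
) -> float:
    """Linear warmup to peak_lr, then cosine decay to final_lr (assumed shape)."""
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    progress = min(1.0, (step - warmup_steps) / max(1, total_steps - warmup_steps))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return final_lr + (peak_lr - final_lr) * cosine
```

Under these assumptions the rate reaches 1e-4 at the end of warmup and settles at 1e-5 by step 200,000.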
Best Checkpoint
The 200000 checkpoint is recommended. Its status on each phase of the task:
| Phase | Status |
|---|---|
| Approach cube | ✅ Works |
| Grasp cube | ✅ Works |
| Place in bin | ⚠️ Partial |
Usage
from lerobot.common.policies.xvla.modeling_xvla import XVLAPolicy

# Load best checkpoint (200k)
policy = XVLAPolicy.from_pretrained(
    "gpudad/xvla-so101-phase2-checkpoints",
    subfolder="200000/pretrained_model",
)

# Or load an earlier checkpoint
policy = XVLAPolicy.from_pretrained(
    "gpudad/xvla-so101-phase2-checkpoints",
    subfolder="100000/pretrained_model",
)
Evaluation Tips
- Use `n_action_steps=4` for faster re-querying (better performance)
- Model works best with 128x128 images (front + wrist cameras)
- Language instruction: "pick up the cube and place it in the bin"
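The re-querying tip can be pictured as follows: the policy predicts a chunk of future actions, only the first `n_action_steps` of them are executed, and the policy is then queried again with a fresh observation. The loop below is a minimal sketch of that pattern; `predict_chunk` is a hypothetical stand-in for the policy call, not a LeRobot API, and in practice LeRobot policies usually handle this queueing internally.

```python
from typing import Callable, List, Sequence

Action = Sequence[float]


def run_with_requery(
    predict_chunk: Callable[[int], List[Action]],  # hypothetical: timestep -> action chunk
    total_steps: int,
    n_action_steps: int = 4,
) -> List[Action]:
    """Execute only the first n_action_steps of each chunk, then re-query."""
    if n_action_steps <= 0:
        raise ValueError("n_action_steps must be positive")
    executed: List[Action] = []
    while len(executed) < total_steps:
        chunk = predict_chunk(len(executed))
        if not chunk:
            raise ValueError("policy returned an empty action chunk")
        # Discard the tail of the chunk; it is replaced by the next query.
        for action in chunk[:n_action_steps]:
            executed.append(action)
            if len(executed) == total_steps:
                break
    return executed
```

A smaller `n_action_steps` means the policy reacts to new observations more often, at the cost of more inference calls per episode.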
Files Structure
├── 020000/
│   └── pretrained_model/
│       ├── model.safetensors
│       ├── config.json
│       └── ...
├── 040000/
│   └── pretrained_model/
├── ...
└── 200000/
    └── pretrained_model/
Citation
Based on X-VLA from LeRobot.