Crab SmolVLA: SA-RWFM Teacher (Tactile)
Fine-tuned SmolVLA with Sensitivity-Aware Reward-Weighted Flow Matching (SA-RWFM) and dual tactile sensors for right-arm manipulation on the Crab robot.
This model serves as the tactile-conditioned teacher for knowledge distillation into HapticsVLA.
Model Details
- Base model: lerobot/smolvla_base (450M params) + DualTactileEncoder
- Action space: 6-DOF absolute joint positions (indices 6–11)
- State input: 6D proprioception + 128D tactile embedding (2×10×10 force matrices)
- Training data: 27 demonstrations + reward labels across 3 tasks
- Best validation loss: 6.56 (note: RWFM loss is not directly comparable to standard MSE)
- Training: 50K steps, RTX 5090, ~4 hrs
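The shapes listed above can be made concrete with a small sketch. Everything here beyond the dimensions is an assumption: the `encode_tactile` projection is a hypothetical stand-in for DualTactileEncoder (whose architecture is not described in this card), the 12-joint vector is assumed only so that indices 6–11 can be sliced, and concatenation is one plausible reading of "6D proprioception + 128D tactile embedding".

```python
import numpy as np

PROPRIO_DIM = 6               # from the model card
TACTILE_EMBED_DIM = 128       # from the model card
TACTILE_SHAPE = (2, 10, 10)   # two 10x10 force matrices, from the model card


def encode_tactile(forces, weight):
    """Hypothetical stand-in for DualTactileEncoder: flatten, then project linearly."""
    forces = np.asarray(forces, dtype=np.float32)
    assert forces.shape == TACTILE_SHAPE
    return forces.reshape(-1) @ weight  # (200,) @ (200, 128) -> (128,)


def build_state(proprio, forces, weight):
    """Concatenate proprioception with the tactile embedding (assumed layout)."""
    proprio = np.asarray(proprio, dtype=np.float32)
    assert proprio.shape == (PROPRIO_DIM,)
    return np.concatenate([proprio, encode_tactile(forces, weight)])


rng = np.random.default_rng(0)
# Random projection weights, for shape illustration only.
weight = rng.standard_normal((200, TACTILE_EMBED_DIM)).astype(np.float32)

all_joints = np.arange(12, dtype=np.float32)  # assumed 12-joint robot vector
right_arm = all_joints[6:12]                  # indices 6-11 inclusive

state = build_state(right_arm, rng.random(TACTILE_SHAPE), weight)
print(state.shape)  # (134,)
```

Under these assumptions the policy would see a 134-dimensional state; the real encoder and layout may differ.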
Key Features
- Dual tactile sensing: Processes left and right 10×10 tactile force matrices
- Reward-weighted flow matching: Upweights successful demonstrations, downweights failures
- Anchor regularization: Prevents reward weight collapse
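As a rough illustration of the last two features, the sketch below shows a generic reward-weighted flow matching loss in NumPy. It is not the SA-RWFM implementation: the linear noise-to-action path, the softmax weighting with temperature `beta`, and the `anchor` mix with a uniform distribution are all assumptions chosen to show how upweighting and a floor on the weights could work. The sensitivity-aware component is not described in this card and is left out.

```python
import numpy as np


def reward_weights(rewards, beta=1.0, anchor=0.2):
    """Softmax weights over rewards, mixed with a uniform 'anchor' (assumed form).

    The uniform mix keeps every weight at or above anchor / batch_size,
    so no demonstration is ignored entirely.
    """
    r = np.asarray(rewards, dtype=np.float64)
    z = beta * (r - r.max())                  # shift for numerical stability
    soft = np.exp(z) / np.exp(z).sum()
    uniform = np.full_like(soft, 1.0 / len(soft))
    return (1.0 - anchor) * soft + anchor * uniform


def interpolate(noise, actions, t):
    """Point on the assumed linear path from noise (t=0) to actions (t=1)."""
    return (1.0 - t) * noise + t * actions


def rwfm_loss(pred_velocity, noise, actions, weights):
    """Weighted mean squared error between predicted and target velocity."""
    target = actions - noise                  # velocity of the linear path
    err = (pred_velocity - target) ** 2
    per_sample = err.reshape(len(err), -1).mean(axis=1)
    return float((weights * per_sample).sum())


rng = np.random.default_rng(0)
actions = rng.standard_normal((2, 6))         # batch of two 6-DOF targets
noise = rng.standard_normal((2, 6))
t = rng.random((2, 1))
x_t = interpolate(noise, actions, t)          # what a model would take as input

weights = reward_weights([1.0, 0.0])          # success vs. failure
perfect = actions - noise                     # an oracle velocity prediction
print(weights)
print(rwfm_loss(perfect, noise, actions, weights))  # 0.0
```

With `anchor=0` the weights reduce to a plain softmax, which can concentrate on a few high-reward samples as `beta` grows; the uniform mix is one simple way to avoid that.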
Performance (Sync Mode, 20 trials per task)
| Task | Success Rate | Force Errors |
|---|---|---|
| Eggs | 85% | 3/20 |
| Can | 55% | 3/20 |
| Waffles | 85% | 1/20 |
| Mean | 75.0% | 7/60 |
Note: This model requires tactile sensor hardware at inference. For a tactile-free alternative with better performance, see HapticsVLA.
Usage
import torch
checkpoint = torch.load("best/model.pt", map_location="cpu")
See Advanced-Robotic-Manipulation/crab for full inference pipeline.
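The layout of `best/model.pt` is not documented here, so the object returned by `torch.load` may be a flat state dict or a dict that wraps one. The helper below is a minimal sketch for inspecting it. The wrapper key names are guesses, not confirmed for this checkpoint, and the demo uses NumPy arrays in place of tensors so that it runs without PyTorch.

```python
import math

import numpy as np

# Hypothetical wrapper keys; the real checkpoint may use none of these.
WRAPPER_KEYS = ("model_state_dict", "state_dict", "model")


def unwrap_state_dict(checkpoint):
    """Return the nested state dict if one is found, else the object unchanged."""
    if isinstance(checkpoint, dict):
        for key in WRAPPER_KEYS:
            inner = checkpoint.get(key)
            if isinstance(inner, dict):
                return inner
    return checkpoint


def count_parameters(state_dict):
    """Sum element counts over all entries that expose a .shape."""
    return sum(
        math.prod(value.shape)
        for value in state_dict.values()
        if hasattr(value, "shape")
    )


# Stand-in for a loaded checkpoint (arrays instead of torch tensors).
fake_checkpoint = {
    "step": 50000,
    "model_state_dict": {
        "encoder.weight": np.zeros((200, 128)),
        "encoder.bias": np.zeros(128),
    },
}
state = unwrap_state_dict(fake_checkpoint)
print(sorted(state))            # ['encoder.bias', 'encoder.weight']
print(count_parameters(state))  # 25728
```

The same functions accept the result of `torch.load`, since tensor shapes behave like tuples. On recent PyTorch releases `torch.load` defaults to `weights_only=True`, so a checkpoint that stores non-tensor Python objects may need `weights_only=False`; only do that for files you trust.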
Citation
If you use this model, please cite our paper (coming soon).