Crab SmolVLA - SA-RWFM Teacher (Tactile)

Fine-tuned SmolVLA with Sensitivity-Aware Reward-Weighted Flow Matching (SA-RWFM) and dual tactile sensors for right-arm manipulation on the Crab robot.

This model serves as the tactile-conditioned teacher for knowledge distillation into HapticsVLA.

Model Details

  • Base model: lerobot/smolvla_base (450M params) + DualTactileEncoder
  • Action space: 6-DOF absolute joint positions (indices 6–11)
  • State input: 6D proprioception + 128D tactile embedding (2×10×10 force matrices)
  • Training data: 27 demonstrations + reward labels across 3 tasks
  • Best validation loss: 6.56 (note: RWFM loss is not directly comparable to standard MSE)
  • Training: 50K steps, RTX 5090, ~4 hrs
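The state layout listed above can be sketched in a few lines. This is a minimal illustration, not the repository's code: it assumes the six arm joints are sliced out of a longer joint vector at indices 6–11, and that the 6D proprioception and the 128D tactile embedding are simply concatenated into a 134D state. The function names and the concatenation order are hypothetical, and the DualTactileEncoder itself is not reproduced here.

```python
from typing import List, Sequence

ARM_JOINT_SLICE = slice(6, 12)  # indices 6-11 inclusive, as listed in the card
TACTILE_SHAPE = (2, 10, 10)     # two sensors, one 10x10 force matrix each
TACTILE_EMBED_DIM = 128         # size of the tactile embedding


def select_arm_joints(full_joints: Sequence[float]) -> List[float]:
    """Pick the 6 arm joint positions out of a longer joint vector."""
    if len(full_joints) < 12:
        raise ValueError("expected at least 12 joint values")
    return list(full_joints[ARM_JOINT_SLICE])


def check_tactile(frames: Sequence[Sequence[Sequence[float]]]) -> None:
    """Raise ValueError unless `frames` is a nested 2x10x10 structure."""
    sensors, rows, cols = TACTILE_SHAPE
    if len(frames) != sensors:
        raise ValueError(f"expected {sensors} tactile sensors")
    for sensor in frames:
        if len(sensor) != rows or any(len(row) != cols for row in sensor):
            raise ValueError(f"expected {rows}x{cols} force matrices")


def build_state(full_joints: Sequence[float],
                tactile_embedding: Sequence[float]) -> List[float]:
    """Concatenate 6D proprioception with a 128D embedding (assumed layout)."""
    if len(tactile_embedding) != TACTILE_EMBED_DIM:
        raise ValueError(f"expected a {TACTILE_EMBED_DIM}D tactile embedding")
    return select_arm_joints(full_joints) + list(tactile_embedding)
```

In this sketch the raw tactile input holds 2 × 10 × 10 = 200 force values, which the encoder would compress to the 128D embedding before `build_state` is called.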

Key Features

  • Dual tactile sensing: Processes left and right 10×10 tactile force matrices
  • Reward-weighted flow matching: Upweights successful demonstrations, downweights failures
  • Anchor regularization: Prevents reward weight collapse
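The card does not spell out the SA-RWFM loss, so the following is only one plausible reading of the reward-weighting and anchor ideas, not the authors' formulation. It assumes softmax-style weights over per-sample rewards, and it interprets "anchor regularization" as blending those weights with uniform weights so that no sample's weight can collapse to zero. The parameter names `beta` and `anchor` are hypothetical, and the sensitivity-aware component is omitted.

```python
import math
from typing import List, Sequence


def reward_weights(rewards: Sequence[float],
                   beta: float = 1.0,
                   anchor: float = 0.1) -> List[float]:
    """Softmax weights over a batch, blended with uniform weights.

    beta   -- temperature (hypothetical); smaller values sharpen the weighting
    anchor -- blend toward uniform (hypothetical); 0 = pure softmax, 1 = uniform
    """
    if not rewards:
        raise ValueError("rewards must be non-empty")
    if beta <= 0 or not 0.0 <= anchor <= 1.0:
        raise ValueError("need beta > 0 and 0 <= anchor <= 1")
    peak = max(rewards)  # subtract the max for numerical stability
    exps = [math.exp((r - peak) / beta) for r in rewards]
    total = sum(exps)
    n = len(rewards)
    return [(1.0 - anchor) * e / total + anchor / n for e in exps]


def weighted_flow_matching_loss(per_sample_losses: Sequence[float],
                                rewards: Sequence[float],
                                beta: float = 1.0,
                                anchor: float = 0.1) -> float:
    """Reward-weighted average of per-sample flow-matching losses."""
    if len(per_sample_losses) != len(rewards):
        raise ValueError("losses and rewards must have the same length")
    weights = reward_weights(rewards, beta, anchor)
    return sum(w * loss for w, loss in zip(weights, per_sample_losses))
```

With `anchor > 0`, every sample keeps a weight of at least `anchor / n`, which is the sense in which this sketch keeps low-reward demonstrations from being ignored entirely.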

Performance (Sync Mode, 20 trials per task)

Task      Success Rate   Force Errors
Eggs      85%            3/20
Can       55%            3/20
Waffles   85%            1/20
Mean      75.0%          7/60
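The summary row can be re-derived from the per-task figures. The short check below assumes the success counts implied by the reported rates over 20 trials (85% of 20 is 17, 55% of 20 is 11). It shows that the 75.0% is the mean of the three success rates, while 7/60 is the total number of force errors across all 60 trials.

```python
TRIALS_PER_TASK = 20

# task -> (successes, force errors); success counts are inferred from the
# reported rates, force-error counts are taken from the table as-is.
results = {
    "Eggs": (17, 3),
    "Can": (11, 3),
    "Waffles": (17, 1),
}

total_trials = TRIALS_PER_TASK * len(results)
success_rates = {
    task: 100 * successes / TRIALS_PER_TASK
    for task, (successes, _) in results.items()
}
mean_success = sum(success_rates.values()) / len(success_rates)
total_force_errors = sum(errors for _, errors in results.values())

print(f"mean success: {mean_success:.1f}%")
print(f"force errors: {total_force_errors}/{total_trials}")
```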

Note: This model requires tactile sensor hardware at inference. For a tactile-free alternative with better performance, see HapticsVLA.

Usage

import torch

# Load the checkpoint onto the CPU; move it to the GPU afterwards if needed.
checkpoint = torch.load("best/model.pt", map_location="cpu")

See Advanced-Robotic-Manipulation/crab for the full inference pipeline.
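Because the internal layout of the checkpoint is not documented here, it can help to inspect it before wiring it into a model. The helper below is a hypothetical utility, not part of the crab repository. It walks a nested dict and reports the shape of anything that exposes a `.shape` attribute, and the type name of everything else. It makes no assumption about which keys the file actually contains.

```python
from typing import Any, Dict


def summarize(obj: Any, prefix: str = "", max_depth: int = 3) -> Dict[str, str]:
    """Map each entry of a nested checkpoint dict to a short description."""
    summary: Dict[str, str] = {}
    if isinstance(obj, dict) and max_depth > 0:
        # Recurse into nested dicts, joining keys with dots.
        for key, value in obj.items():
            name = f"{prefix}.{key}" if prefix else str(key)
            summary.update(summarize(value, name, max_depth - 1))
    elif hasattr(obj, "shape"):
        # Tensors and arrays: report their shape only.
        summary[prefix] = f"tensor{tuple(obj.shape)}"
    else:
        # Anything else (ints, strings, too-deep dicts): report the type name.
        summary[prefix] = type(obj).__name__
    return summary


# Example usage with the checkpoint loaded as in the snippet above:
#   for name, description in summarize(checkpoint).items():
#       print(name, description)
```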

Citation

If you use this model, please cite our paper (coming soon).
