Reward Model Card for robometer

Robometer is a zero-shot general-purpose robotic reward model built on a fine-tuned Qwen3-VL backbone with progress, preference, and success heads. Given a video and a task description it outputs a per-frame progress signal in [0, 1] and a per-frame success probability — suitable for offline reward labelling and for low-frequency reward signals during RL fine-tuning of robot policies.

This reward model has been trained and pushed to the Hub using LeRobot. See the full documentation at LeRobot Docs.


How to Get Started with the Reward Model

Train from scratch

lerobot-train \
  --dataset.repo_id=${HF_USER}/<dataset> \
  --reward_model.type=robometer \
  --output_dir=outputs/train/<desired_reward_model_repo_id> \
  --job_name=lerobot_reward_training \
  --reward_model.device=cuda \
  --reward_model.repo_id=${HF_USER}/<desired_reward_model_repo_id> \
  --wandb.enable=true

Writes checkpoints to outputs/train/<desired_reward_model_repo_id>/checkpoints/.

Load the reward model in Python

from lerobot.rewards import make_reward_model

reward_model = make_reward_model(pretrained_path="<hf_user>/<reward_model_repo_id>")
reward = reward_model.compute_reward(batch)

Model Details

  • License: apache-2.0
Downloads last month
49
Safetensors
Model size
4B params
Tensor type
BF16
·
Video Preview
loading