Add robotics pipeline tag and paper links
Hi! I'm Niels from the Hugging Face community science team. I've opened this PR to add the `robotics` pipeline tag to your model's metadata, which helps researchers find your work when filtering for robotics-related models. I've also updated the README to include direct links to the paper page, project page, and GitHub repository.
README.md (changed):

````diff
@@ -1,16 +1,17 @@
 ---
-license: apache-2.0
 base_model: Qwen/Qwen3-VL-4B-Instruct
-tags:
-- reward model
-- robot learning
-- foundation models
 library_name: transformers
+license: apache-2.0
+pipeline_tag: robotics
+tags:
+- reward model
+- robot learning
+- foundation models
 ---
 
 # Robometer 4B LIBERO
 
-**
+[**Project Page**](https://robometer.github.io/) | [**Paper**](https://huggingface.co/papers/2603.02115) | [**GitHub**](https://github.com/robometer/robometer)
 
 **Robometer** is a general-purpose vision-language reward model for robotics. It is trained on [RBM-1M](https://huggingface.co/datasets/) with **Qwen3-VL-4B** to predict **per-frame progress**, **per-frame success**, and **trajectory preferences** from rollout videos. The model combines (1) frame-level progress supervision on expert data and (2) trajectory-comparison preference supervision, so it can learn from both successful and failed rollouts and generalize across diverse robot embodiments and tasks.
 
@@ -20,11 +21,11 @@ Given a **task instruction** and a **rollout video** (or frame sequence), the mo
 - **Per-frame success** — success probability (or binary) at each timestep.
 - **Preference / ranking** — which of two trajectories is better for the task.
 
-This
+This specific checkpoint is trained on LIBERO-10/Spatial/Object/Goal.
 
 ### Usage
 
-For full setup, example scripts, and configs, see the **GitHub repo**: [github.com/
+For full setup, example scripts, and configs, see the **GitHub repo**: [github.com/robometer/robometer](https://github.com/robometer/robometer).
 
 **Option 1 — Run the model locally** (loads this checkpoint from Hugging Face):
 
@@ -62,7 +63,7 @@ If you use this model, please cite:
 title={Robometer: Scaling General-Purpose Robotic Reward Models via Trajectory Comparisons},
 author={Anthony Liang* and Yigit Korkmaz* and Jiahui Zhang and Minyoung Hwang and Abrar Anwar and Sidhant Kaushik and Aditya Shah and Alex S. Huang and Luke Zettlemoyer and Dieter Fox and Yu Xiang and Anqi Li and Andreea Bobu and Abhishek Gupta and Stephen Tu† and Erdem B{\i}y{\i}k† and Jesse Zhang†},
 year={2025},
-url={https://github.com/
-note={arXiv
+url={https://github.com/robometer/robometer},
+note={arXiv:2603.02115}
 }
-```
+```
````