teetone commited on
Commit
3a185b4
·
verified ·
1 Parent(s): 18bd333

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +50 -1
README.md CHANGED
@@ -6,4 +6,53 @@ language:
6
  - en
7
  base_model:
8
  - Qwen/Qwen3-VL-8B-Instruct
9
- ---
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
6
  - en
7
  base_model:
8
  - Qwen/Qwen3-VL-8B-Instruct
9
+ ---
10
+
11
+
12
+ # RoboReward 8B
13
+
14
+ **Paper:** [https://arxiv.org/abs/2601.00675](https://arxiv.org/abs/2601.00675)
15
+
16
+ RoboReward provides **general-purpose vision-language reward model for robotics**, trained on the [RoboReward dataset](https://huggingface.co/datasets/teetone/RoboReward) with **Qwen-3 VL** to predict **discrete end-of-episode progress rewards** from real-robot rollout videos.
17
+
18
+
19
+ ## Usage
20
+
21
+ ### Purpose
22
+
23
+ Given a **task instruction** and a **rollout video**, the model predicts an end-of-episode progress score:
24
+ - **1:** No success
25
+ - **2:** Minimal progress
26
+ - **3:** Partial completion
27
+ - **4:** Near completion
28
+ - **5:** Perfect completion
29
+
30
+ ### Inference
31
+
32
+ Follow the [original Qwen 3-VL instructions with video input](https://huggingface.co/Qwen/Qwen3-VL-8B-Instruct) and use a text prompt like this:
33
+
34
+ ```text
35
+ Given the task, assign a discrete progress score reward (1,2,3,4,5) for the robot in the video in the format: ANSWER: <score>
36
+ Rubric for end-of-episode progress (judge only the final state without time limits):
37
+ 1 - No Success: Final state shows no goal-relevant change for the command.
38
+ 2 - Minimal Progress: Final state shows a small but insufficient change toward the goal.
39
+ 3 - Partial Completion: The final state shows good progress toward the goal but violates more than one requirement or a major requirement.
40
+ 4 - Near Completion: Final state is correct in region and intent but misses a single minor requirement.
41
+ 5 - Perfect Completion: Final state satisfies all requirements.
42
+
43
+ Task: <INSERT TASK HERE>
44
+ ```
45
+
46
+ ## Citation
47
+
48
+ ```bibtex
49
+ @misc{lee2026roborewardgeneralpurposevisionlanguagereward,
50
+ title={RoboReward: General-Purpose Vision-Language Reward Models for Robotics},
51
+ author={Tony Lee and Andrew Wagenmaker and Karl Pertsch and Percy Liang and Sergey Levine and Chelsea Finn},
52
+ year={2026},
53
+ eprint={2601.00675},
54
+ archivePrefix={arXiv},
55
+ primaryClass={cs.RO},
56
+ url={https://arxiv.org/abs/2601.00675},
57
+ }
58
+ ```