teetone
/

RoboReward-4B

Model card Files Files and versions

RoboReward-4B / README.md

teetone's picture

Update README.md

4dec8af verified 7 days ago

|

history blame contribute delete

2.08 kB

	---
	license: cc-by-4.0
	datasets:
	- teetone/RoboReward
	language:
	- en
	base_model:
	- Qwen/Qwen3-VL-4B-Instruct
	---

	# RoboReward 4B

	Paper: [https://arxiv.org/abs/2601.00675](https://arxiv.org/abs/2601.00675)

	RoboReward provides general-purpose vision-language reward model for robotics, trained on the [RoboReward dataset](https://huggingface.co/datasets/teetone/RoboReward) with Qwen-3 VL to predict discrete end-of-episode progress rewards from real-robot rollout videos.


	## Usage

	### Purpose

	Given a task instruction and a rollout video, the model predicts an end-of-episode progress score:
	- 1: No success
	- 2: Minimal progress
	- 3: Partial completion
	- 4: Near completion
	- 5: Perfect completion

	### Inference

	Follow the [original Qwen 3-VL instructions with video input](https://huggingface.co/Qwen/Qwen3-VL-4B-Instruct) and use a text prompt like this:

	```text
	Given the task, assign a discrete progress score reward (1,2,3,4,5) for the robot in the video in the format: ANSWER: <score>
	Rubric for end-of-episode progress (judge only the final state without time limits):
	1 - No Success: Final state shows no goal-relevant change for the command.
	2 - Minimal Progress: Final state shows a small but insufficient change toward the goal.
	3 - Partial Completion: The final state shows good progress toward the goal but violates more than one requirement or a major requirement.
	4 - Near Completion: Final state is correct in region and intent but misses a single minor requirement.
	5 - Perfect Completion: Final state satisfies all requirements.

	Task: <INSERT TASK HERE>
	```

	## Citation

	```bibtex
	@misc{lee2026roborewardgeneralpurposevisionlanguagereward,
	title={RoboReward: General-Purpose Vision-Language Reward Models for Robotics},
	author={Tony Lee and Andrew Wagenmaker and Karl Pertsch and Percy Liang and Sergey Levine and Chelsea Finn},
	year={2026},
	eprint={2601.00675},
	archivePrefix={arXiv},
	primaryClass={cs.RO},
	url={https://arxiv.org/abs/2601.00675},
	}
	```