metadata
license: apache-2.0
base_model: Qwen/Qwen3-VL-4B-Instruct
tags:
- reward_model
- rfm
- preference_comparisons
library_name: transformers
rewardfm/libero_testset_prog_4frames_fixdata
Model Details
- Base Model: Qwen/Qwen3-VL-4B-Instruct
- Model Type: qwen3_vl
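A quick way to sanity-check the details above is to read the checkpoint's configuration and compare its `model_type` with the value listed here. The following is a minimal sketch, not a documented loading procedure for this model. It assumes the card title is the Hub repository ID, that the repository ships a standard `config.json`, and that the installed `transformers` version is recent enough to recognize `qwen3_vl`. The helper name `check_model_type` is hypothetical.

```python
# Identifiers copied from this card. Treating the card title as the Hub
# repository ID is an assumption.
REPO_ID = "rewardfm/libero_testset_prog_4frames_fixdata"
BASE_MODEL_ID = "Qwen/Qwen3-VL-4B-Instruct"
EXPECTED_MODEL_TYPE = "qwen3_vl"


def check_model_type(repo_id: str = REPO_ID) -> bool:
    """Fetch the config and compare its model_type with the card.

    Needs network access and assumes the repository has a standard config.json.
    """
    # Imported lazily so the constants above are usable without transformers.
    from transformers import AutoConfig

    config = AutoConfig.from_pretrained(repo_id)
    return config.model_type == EXPECTED_MODEL_TYPE


if __name__ == "__main__":
    print(check_model_type())
```

If the check returns `False`, the checkpoint may use a custom configuration class, in which case the loading code from the training project would be needed instead.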
Training Run
- Wandb Run: libero_ablation_prog_4frames_fixdata
- Wandb ID: ds6utsjz
- Project: rfm
- Notes: libero prog only
Citation
If you use this model, please cite: