ob11 committed
Commit 99d3938 · verified · 1 Parent(s): c258863

Update README.md

Files changed (1)
  1. README.md +8 -8
README.md CHANGED
@@ -11,9 +11,9 @@ datasets:
 
 > Qwen-VL-PRM-3B is a process reward model finetuned from Qwen2.5-3B-Instruct on approximately 300,000 examples. It demonstrates strong test-time scaling performance improvements on various advanced multimodal reasoning benchmarks when used with Qwen2.5-VL and Gemma-3 models despite being trained mainly on abstract reasoning datasets and elementary reasoning datasets.
 
-- **Logs:** https://wandb.ai/aisg-arf/multimodal-reasoning/runs/pnsncs80/
-- **Repository:** [ob11/vlprm](https://github.com/theogbrand/vlprm/)
-- **Paper:** https://arxiv.org/abs/
+- **Logs:** https://wandb.ai/aisg-arf/multimodal-reasoning/runs/pnsncs80
+- **Repository:** https://github.com/theogbrand/vlprm
+- **Paper:** https://arxiv.org/pdf/2509.23250
 
 # Use
 
@@ -59,12 +59,12 @@ The model usage is documented [here](https://github.com/theogbrand/vlprm/blob/ma
 
 ```bibtex
 @misc{ong2025vlprms,
- title={VL-PRMs: Vision-Language Process Reward Models},
- author={Brandon Ong, Tej Deep Pala, Vernon Toh, William Chandra Tjhi and Soujanya Poria},
+ title={Training Vision-Language Process Reward Models for Test-Time Scaling in Multimodal Reasoning: Key Insights and Lessons Learned},
+ author={Brandon Ong, Tej Deep Pala, Vernon Toh, William Chandra Tjhi, and Soujanya Poria},
  year={2025},
- eprint={},
+ eprint={2509.23250},
  archivePrefix={arXiv},
- primaryClass={cs.CL},
- url={},
+ primaryClass={cs.AI},
+ url={https://arxiv.org/pdf/2509.23250},
 }
 ```