GenPRM/GenPRM-7B

RyanLiu112 committed on Apr 5, 2025

Commit ad84702 (verified)

1 Parent(s): 947c96c

Update README.md

Files changed (1)

README.md

@@ -8,12 +8,12 @@ base_model:
 # Introduction
-We propose GenPRM, a strong generative process reward model with the following features:
-- reasoning with explicit CoT and code verification before providing the process judgment;
-- improving Monte Carlo estimation and hard label with Relative Progress Estimation (RPE);
-- supporting GenPRM test-time scaling in a parallel manner with majority voting;
-- supporting policy model test-time scaling with GenPRM as verifiers or critics.
 GenPRM achieves state-of-the-art performance across multiple benchmarks in two key roles:
@@ -26,8 +26,8 @@ GenPRM achieves state-of-the-art performance across multiple benchmarks in two k
 - Paper: [https://arxiv.org/abs/2504.00891](https://arxiv.org/abs/2504.00891)
 - Code: [https://github.com/RyanLiu112/GenPRM](https://github.com/RyanLiu112/GenPRM)
 - Awesome Process Reward Models: [Awesome Process Reward Models](https://github.com/RyanLiu112/Awesome-Process-Reward-Models)
-- HF Paper Link：[GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning](https://hf.co/papers/2504.00891)
-- HF Collection：[GenPRM](https://hf.co/collections/GenPRM/genprm-67ee4936234ba5dd16bb9943)
 # Model details

 # Introduction
+We propose **GenPRM**, a strong generative process reward model with the following features:
+- reasoning with explicit **CoT reasoning** and **code verification** before providing the process judgment;
+- improving Monte Carlo estimation and hard label with **Relative Progress Estimation (RPE)**;
+- supporting GenPRM **test-time scaling** in a parallel manner with majority voting;
+- supporting policy model test-time scaling with GenPRM as **verifiers** or **critics**.
 GenPRM achieves state-of-the-art performance across multiple benchmarks in two key roles:
 - Paper: [https://arxiv.org/abs/2504.00891](https://arxiv.org/abs/2504.00891)
 - Code: [https://github.com/RyanLiu112/GenPRM](https://github.com/RyanLiu112/GenPRM)
 - Awesome Process Reward Models: [Awesome Process Reward Models](https://github.com/RyanLiu112/Awesome-Process-Reward-Models)
+- HF Paper Link: [GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning](https://hf.co/papers/2504.00891)
+- HF Collection: [GenPRM](https://hf.co/collections/GenPRM/genprm-67ee4936234ba5dd16bb9943)
 # Model details