---
language:
- en
- zh
license: apache-2.0
library_name: transformers
tags:
- qwen3
- reward-model
- text-classification
base_model: Qwen/Qwen3-8B
pipeline_tag: text-classification
arxiv: 2601.21912
---

# Model Card for ProRAG-PRM

This is the **Process Reward Model (PRM)** associated with the ProRAG project. It is fine-tuned from [Qwen/Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B) to evaluate the quality of intermediate reasoning steps.

It follows the methodology described in the associated paper, **arXiv:2601.21912**.

## Model Details

- **Base Model:** Qwen3-8B
- **Type:** Process Reward Model (PRM) / Sequence Classification
- **Task:** Step-by-step Reasoning Evaluation
- **Paper:** [View on arXiv](https://arxiv.org/abs/2601.21912)

## 💻 Code & Inference

This model is designed to assign rewards/scores to reasoning steps.

For the specific scoring logic, data formatting (e.g., how to mark steps), and inference scripts, please refer to our GitHub repository:

👉 **[ProRAG GitHub repository](https://github.com/lilinwz/ProRAG/tree/main)**

*(Please ensure you use the correct scoring script provided in the repo, as standard Hugging Face pipelines may not interpret the process rewards correctly without specific formatting.)*
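As a rough illustration of how a PRM of this kind is typically queried, the sketch below formats a question and its reasoning steps with a step-boundary marker and converts two-class classifier logits into a scalar step reward. Note that `STEP_SEP`, the input template, and the class ordering are **assumptions for illustration only**; the authoritative formatting and scoring logic live in the ProRAG repository linked above.

```python
import math

# Hypothetical step separator; consult the ProRAG repo for the real token/template.
STEP_SEP = "<extra_0>"

def format_prm_input(question: str, steps: list[str]) -> str:
    """Concatenate a question and its reasoning steps, marking each step
    boundary with a separator so the PRM can score steps individually.
    (Sketch only: the exact input template is defined in the ProRAG repo.)"""
    return question + "\n" + STEP_SEP.join(steps) + STEP_SEP

def step_reward(logits: list[float]) -> float:
    """Turn a two-class logit pair at a separator position into a scalar
    reward: the softmax probability of the (assumed) 'good step' class."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    return exps[1] / sum(exps)
```

In practice you would load the checkpoint with `AutoModelForSequenceClassification` (or the repo's own loading code), run the formatted input through the model, and apply `step_reward` to the logits read at each separator position.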

## Citation

If you use this model or the associated paper in your research, please cite:

```bibtex
@misc{wang2026proragprocesssupervisedreinforcementlearning,
      title={ProRAG: Process-Supervised Reinforcement Learning for Retrieval-Augmented Generation}, 
      author={Zhao Wang and Ziliang Zhao and Zhicheng Dou},
      year={2026},
      eprint={2601.21912},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2601.21912}, 
}
```