gaotang committed on
Commit 8a789a5 · verified · 1 Parent(s): 43dbbc4

Update README.md

Files changed (1)
  1. README.md +7 -6
README.md CHANGED
@@ -37,7 +37,8 @@ Compared to traditional scalar or generative reward models, RM-R1 delivers **sta
  ## 🔍 Demo Code

  Try the model with this example. Full demo notebook available at:
- 📎 GitHub: demo/demo.ipynb
+
+ 📎 [Official Demo Link](https://github.com/RM-R1-UIUC/RM-R1/blob/main/demo/demo.ipynb)

  ### 🧾 Prompt Template

@@ -154,10 +155,10 @@ print(completion)
  ## Citations

  ```bibtex
- @misc{2505.02387,
- Author = {Xiusi Chen and Gaotang Li and Ziqi Wang and Bowen Jin and Cheng Qian and Yu Wang and Hongru Wang and Yu Zhang and Denghui Zhang and Tong Zhang and Hanghang Tong and Heng Ji},
- Title = {RM-R1: Reward Modeling as Reasoning},
- Year = {2025},
- Eprint = {arXiv:2505.02387},
+ @article{chen2025rm,
+ title={RM-R1: Reward Modeling as Reasoning},
+ author={Chen, Xiusi and Li, Gaotang and Wang, Ziqi and Jin, Bowen and Qian, Cheng and Wang, Yu and Wang, Hongru and Zhang, Yu and Zhang, Denghui and Zhang, Tong and others},
+ journal={arXiv preprint arXiv:2505.02387},
+ year={2025}
  }
  ```