Update README.md
## 🔍 Demo Code

Try the model with this example. The full demo notebook is available at:

📎 [Official Demo Link](https://github.com/RM-R1-UIUC/RM-R1/blob/main/demo/demo.ipynb)

### 🧾 Prompt Template
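As a taste of what a query to a reasoning reward model looks like, here is a minimal, stdlib-only sketch. It assumes an OpenAI-compatible `/chat/completions` endpoint (e.g. one served by vLLM) at `localhost:8000` with model name `RM-R1`; the server URL, model name, function names, and the pairwise prompt layout are all illustrative assumptions, not the official demo — use the linked notebook and the prompt template below for the real thing.

```python
import json
import urllib.request


def build_judge_messages(question, answer_a, answer_b):
    """Pack a pairwise comparison into chat messages.

    Illustrative layout only -- substitute the official prompt
    template from the 'Prompt Template' section below.
    """
    user = (
        f"Question: {question}\n\n"
        f"Answer A: {answer_a}\n\n"
        f"Answer B: {answer_b}\n\n"
        "Reason step by step, then conclude with 'A' or 'B'."
    )
    return [{"role": "user", "content": user}]


def query_reward_model(messages, base_url="http://localhost:8000/v1", model="RM-R1"):
    """POST to a hypothetical OpenAI-compatible endpoint (e.g. vLLM)."""
    payload = json.dumps({"model": model, "messages": messages}).encode()
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


messages = build_judge_messages("What is 2 + 2?", "4", "5")
print(messages[0]["content"])  # inspect the prompt before sending
# completion = query_reward_model(messages)  # requires a running server
# print(completion)
```

The network call is left commented out so the snippet runs standalone; with a live server, the returned completion would contain the model's reasoning trace followed by its verdict.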
## Citations

```bibtex
@article{chen2025rm,
  title={RM-R1: Reward Modeling as Reasoning},
  author={Chen, Xiusi and Li, Gaotang and Wang, Ziqi and Jin, Bowen and Qian, Cheng and Wang, Yu and Wang, Hongru and Zhang, Yu and Zhang, Denghui and Zhang, Tong and others},
  journal={arXiv preprint arXiv:2505.02387},
  year={2025}
}
```