Yirany committed on
Commit 9528653 · verified · 1 Parent(s): 6d5295f

Update README.md

Files changed (1)
  1. README.md +9 -6
README.md CHANGED
@@ -8,7 +8,7 @@ language:

  # Model Card for RLPR-Gemma2-2B-it

- [GitHub](https://github.com/openbmb/RLPR) | [Paper](https://github.com/OpenBMB/RLPR/blob/main/RLPR_paper.pdf)
+ [GitHub](https://github.com/openbmb/RLPR) | [Paper](https://arxiv.org/abs/2506.18254)

  **RLPR-Gemma2-2B-it** is trained from Gemma2-2B-it with the [RLPR](https://github.com/openbmb/RLPR) framework, which eliminates reliance on external verifiers and is simple and generalizable for more domains.

@@ -61,10 +61,13 @@ print(tokenizer.decode(outputs[0]))
  If you find our model/code/paper helpful, please consider citing our papers 📝:

  ```bibtex
- @article{yu2025rlpr,
- title={RLPR: Extrapolating RLVR to General Domains without Verifiers},
- author={Yu, Tianyu and Ji, Bo and Wang, Shouli and Yao, Shu and Wang, Zefan and Cui, Ganqu and Yuan, Lifan and Ding, Ning and Yao, Yuan and Liu, Zhiyuan and Sun, Maosong and Chua, Tat-Seng},
- journal={arXiv preprint arXiv:2506.xxxxx},
- year={2025}
+ @misc{yu2025rlprextrapolatingrlvrgeneral,
+ title={RLPR: Extrapolating RLVR to General Domains without Verifiers},
+ author={Tianyu Yu and Bo Ji and Shouli Wang and Shu Yao and Zefan Wang and Ganqu Cui and Lifan Yuan and Ning Ding and Yuan Yao and Zhiyuan Liu and Maosong Sun and Tat-Seng Chua},
+ year={2025},
+ eprint={2506.18254},
+ archivePrefix={arXiv},
+ primaryClass={cs.LG},
+ url={https://arxiv.org/abs/2506.18254},
  }
  ```