Update README.md
README.md
CHANGED
@@ -9,9 +9,9 @@ library_name: transformers

 # Model Card for RLFR-Qwen2.5-Math-7B

-[GitHub](https://github.com/
+[GitHub](https://github.com/Jinghaoleven/RLFR) | [Paper](https://arxiv.org) | [WebPage](jinghaoleven.github.io/RLFR)

-**RLFR-Qwen2.5-Math-7B** is trained from Qwen2.5-Math-7B with the [RLFR](https://github.com/
+**RLFR-Qwen2.5-Math-7B** is trained from Qwen2.5-Math-7B with the [RLFR](https://github.com/Jinghaoleven/RLFR) framework, which introduces the flow reward derived from latent space, extending RLVR with latent reward utilization.

 ## Model Details
