Update README.md
README.md
CHANGED
@@ -5,11 +5,10 @@ base_model:
 ---
 <div align="center">
 
-# UloRL
-
 <div>
 An <strong>U</strong>ltra-<strong>L</strong>ong <strong>O</strong>utput <strong>R</strong>einforcement <strong>L</strong>earning Approach for Advancing Large Language Models' Reasoning Abilities
 </div>
+<a href="https://arxiv.org/pdf/2507.19766" target="_blank">Paper</a> | <a href="https://github.com/liushulinle/UloRL" target="_blank">GitHub</a>
 </div>
 
 ## Overview