Update README.md
README.md
CHANGED
@@ -14,6 +14,15 @@ metrics:
 # ✨ Klear-Reasoner-8B
 We present Klear-Reasoner, a model with long reasoning capabilities that demonstrates careful deliberation during problem solving, achieving outstanding performance across multiple benchmarks. We investigate two key issues with current clipping mechanisms in RL: Clipping suppresses critical exploration signals and ignores suboptimal trajectories. To address these challenges, we propose **G**radient-**P**reserving clipping **P**olicy **O**ptimization (**GPPO**) that gently backpropagates gradients from clipped tokens.

+| Resource | Link |
+|---|---|
+| Preprints | [Paper](https://arxiv.org/pdf/2508.07629) |
+| 🤗 Daily Paper | [Paper](https://huggingface.co/papers/2508.07629) |
+| 🤗 Model Hub | [Klear-Reasoner-8B](https://huggingface.co/Suu/Klear-Reasoner-8B) |
+| 🤗 Dataset Hub | [Math RL](https://huggingface.co/datasets/Suu/KlearReasoner-MathSub-30K) |
+| 🤗 Dataset Hub | [Code RL](https://huggingface.co/datasets/Suu/KlearReasoner-CodeSub-15K) |
+| Issues & Discussions | [GitHub Issues](https://github.com/suu990901/KlearReasoner/issues) |
+| 📧 Contact | suzhenpeng13@163.com |

 ## Overview