Safetensors
English
qwen3
Suu committed · Commit 626667e · verified · 1 Parent(s): 0944f64

Update README.md

Files changed (1)
  1. README.md +9 -0
README.md CHANGED
@@ -14,6 +14,15 @@ metrics:
  # ✨ Klear-Reasoner-8B
  We present Klear-Reasoner, a model with long reasoning capabilities that demonstrates careful deliberation during problem solving, achieving outstanding performance across multiple benchmarks. We investigate two key issues with current clipping mechanisms in RL: Clipping suppresses critical exploration signals and ignores suboptimal trajectories. To address these challenges, we propose **G**radient-**P**reserving clipping **P**olicy **O**ptimization (**GPPO**) that gently backpropagates gradients from clipped tokens.

+ | Resource | Link |
+ |---|---|
+ | 📝 Preprints | [Paper](https://arxiv.org/pdf/2508.07629) |
+ | 🤗 Daily Paper | [Paper](https://huggingface.co/papers/2508.07629) |
+ | 🤗 Model Hub | [Klear-Reasoner-8B](https://huggingface.co/Suu/Klear-Reasoner-8B) |
+ | 🤗 Dataset Hub | [Math RL](https://huggingface.co/datasets/Suu/KlearReasoner-MathSub-30K) |
+ | 🤗 Dataset Hub | [Code RL](https://huggingface.co/datasets/Suu/KlearReasoner-CodeSub-15K) |
+ | 🐛 Issues & Discussions | [GitHub Issues](https://github.com/suu990901/KlearReasoner/issues) |
+ | 📧 Contact | suzhenpeng13@163.com |

  ## 📌 Overview