Kwai-Klear
/

Klear-Reasoner-8B

Model card Files Files and versions

Suu commited on Aug 11, 2025

Commit

7df4401

·

verified ·

1 Parent(s): 2de7cbf

Update README.md

Files changed (1) hide show

README.md +2 -2

README.md CHANGED Viewed

@@ -6,14 +6,14 @@ base_model:
 - Qwen/Qwen3-8B-Base
 ---
-# ✨ Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization
 We present Klear-Reasoner, a model with long reasoning capabilities that demonstrates careful deliberation during problem solving, achieving outstanding performance across multiple benchmarks. We investigate two key issues with current clipping mechanisms in RL: Clipping suppresses critical exploration signals and ignores suboptimal trajectories. To address these challenges, we propose **G**radient-**P**reserving clipping **P**olicy **O**ptimization (**GPPO**) that gently backpropagates gradients from clipped tokens.
 ## 📌 Overview
 <div align="center">
-<img src="https://github.com/suu990901/KlearReasoner/blob/main/docker/main_result.png" width="100%"/>
 <sub>Benchmark accuracy of Klear-Reasoner-8B on AIME 2024/2025 (avg@64), LiveCodeBench V5 (2024/08/01-2025/02/01, avg@8), and v6 (2025/02/01-2025/05/01, avg@8).</sub>
 </div>

 - Qwen/Qwen3-8B-Base
 ---
+# ✨ Klear-Reasoner-8B
 We present Klear-Reasoner, a model with long reasoning capabilities that demonstrates careful deliberation during problem solving, achieving outstanding performance across multiple benchmarks. We investigate two key issues with current clipping mechanisms in RL: Clipping suppresses critical exploration signals and ignores suboptimal trajectories. To address these challenges, we propose **G**radient-**P**reserving clipping **P**olicy **O**ptimization (**GPPO**) that gently backpropagates gradients from clipped tokens.
 ## 📌 Overview
 <div align="center">
+<img src="assets/main_result.png" width="100%"/>
 <sub>Benchmark accuracy of Klear-Reasoner-8B on AIME 2024/2025 (avg@64), LiveCodeBench V5 (2024/08/01-2025/02/01, avg@8), and v6 (2025/02/01-2025/05/01, avg@8).</sub>
 </div>