Update README.md
README.md
CHANGED
@@ -26,7 +26,7 @@ library_name: transformers
 
 </div>
 
-This work is a companion to our earlier report [**
+This work is a companion to our earlier report [**HiPO: Hybrid Policy Optimization for Dynamic Reasoning in LLMs**](https://arxiv.org/abs/2509.23967), where we first introduced the **AutoThink paradigm** for controllable reasoning. While KAT-V1 outlined the overall framework of **SFT + RL** for adaptive reasoning, this paper provides the **detailed algorithmic design** of that training recipe.
 
 ***
 