Update README.md
README.md
CHANGED
@@ -14,7 +14,7 @@ library_name: transformers
 
 ## FastCuRL Overview
 
-We release **FastCuRL-1.5B-Preview**, a slow-thinking reasoning model that
+We release **FastCuRL-1.5B-Preview**, a slow-thinking reasoning model that **outperforms** the previous SoTA *DeepScaleR-1.5B-Preview* with **50% training steps**! We adapt a novel curriculum-guided iterative lengthening reinforcement learning to the *DeepSeek-R1-Distill-Qwen-1.5B* and observe continuous performance improvement as training steps increase. To better reproduce our work and advance research progress, we open-source our code, model, and data.
 
 Code: https://github.com/nick7nlp/FastCuRL
 