## Introduction
We present **Ring-mini-2.0**, a compact yet powerful reasoning model. It has 16B total parameters, of which only 1.4B (789M non-embedding) are activated per input token. Despite its compact size, **Ring-mini-2.0** reaches the top tier of sub-10B dense LLMs and even matches or surpasses much larger MoE models, thanks to pre-training on 20T tokens of high-quality data, long-CoT supervised fine-tuning, and multi-stage reinforcement learning.
## Model Downloads