Update README.md
README.md
CHANGED
|
@@ -2,7 +2,7 @@
|
|
| 2 |
# Ring-lite
|
| 3 |
|
| 4 |
<p align="center">
|
| 5 |
-
<img src="https://huggingface.co/inclusionAI/Ring-lite
|
| 6 |
<p>
|
| 7 |
|
| 8 |
<p align="center">
|
|
@@ -11,7 +11,7 @@
|
|
| 11 |
|
| 12 |
## Introduction
|
| 13 |
|
| 14 |
-
Ring-lite is a fully open-source MoE LLM provided by InclusionAI, which has 16.8B parameters with 2.75B activated parameters. It was derived from [Ling-lite-1.5](https://huggingface.co/inclusionAI/Ling-lite-1.5) through a training process involving reasoning SFT, reasoning RL and general SFT. This model delivers performance comparable to [Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B) on reasoning benchmarks, while activating only one-third of its
|
| 15 |
|
| 16 |
## Model Downloads
|
| 17 |
|
|
@@ -19,34 +19,15 @@ Ring-lite is an fully open-source MoE LLM provided by InclusionAI, which has 16.
|
|
| 19 |
|
| 20 |
| **Model** | **#Total Params** | **#Activated Params** | **Context Length** | **Download** |
|
| 21 |
| :----------------: | :---------------: | :-------------------: | :----------------: | :----------: |
|
| 22 |
-
| Ring-lite-distill-preview | 16.8B | 2.75B | 64K | [🤗 HuggingFace](https://huggingface.co/inclusionAI/Ring-lite
|
| 23 |
|
| 24 |
</div>
|
| 25 |
|
| 26 |
## Evaluation
|
| 27 |
-
In order to fully evaluate the model's performance, we examined Ring-lite
|
| 28 |
### Reasoning ability
|
| 29 |
|
| 30 |
-
<div align="center">
|
| 31 |
-
|
| 32 |
-
| **Model** | **AIME24** | **MATH-500** | **GPQA-diamond** | **LiveCodeBench** |
|
| 33 |
-
| :----------------: | :---------------: | :-------------------: | :----------------: | :----------: |
|
| 34 |
-
| DeepSeek-R1-Distill-Qwen-7B (reported) | 55.5 | 92.8 | 49.1 | 37.6 |
|
| 35 |
-
| DeepSeek-R1-Distill-Qwen-7B (reproduce) | 53.2 | 93.7 | 50.4 | 36.5 |
|
| 36 |
-
| Ring-lite-distill-preview | 56.3 | 93.7 | 46.2 | 31.9 |
|
| 37 |
-
|
| 38 |
-
</div>
|
| 39 |
-
|
| 40 |
-
### General ability
|
| 41 |
|
| 42 |
-
<div align="center">
|
| 43 |
-
|
| 44 |
-
| **Model** | **IFEval** | **T-eval** | **BFCL_v2** | **MMLU** |
|
| 45 |
-
| :----------------: | :---------------: | :-------------------: | :----------------: | :----------: |
|
| 46 |
-
| DeepSeek-R1-Distill-Qwen-7B (reproduce) | 39.3 | 26.9 | 38.9 | 44.1 |
|
| 47 |
-
| Ring-lite-distill-preview | 75.3 | 81.3 | 63.0 | 63.3 |
|
| 48 |
-
|
| 49 |
-
</div>
|
| 50 |
More details will be reported in our technical report. [TBD]
|
| 51 |
|
| 52 |
## Quickstart
|
|
|
|
| 2 |
# Ring-lite
|
| 3 |
|
| 4 |
<p align="center">
|
| 5 |
+
<img src="https://huggingface.co/inclusionAI/Ring-lite/blob/main/ant-bailing.png" width="100"/>
|
| 6 |
<p>
|
| 7 |
|
| 8 |
<p align="center">
|
|
|
|
| 11 |
|
| 12 |
## Introduction
|
| 13 |
|
| 14 |
+
Ring-lite is a fully open-source MoE LLM provided by InclusionAI, which has 16.8B parameters with 2.75B activated parameters. It was derived from [Ling-lite-1.5](https://huggingface.co/inclusionAI/Ling-lite-1.5) through a training process involving reasoning SFT, reasoning RL and general SFT. This model delivers performance comparable to [Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B) on reasoning benchmarks, while activating only one-third of its parameters.
|
| 15 |
|
| 16 |
## Model Downloads
|
| 17 |
|
|
|
|
| 19 |
|
| 20 |
| **Model** | **#Total Params** | **#Activated Params** | **Context Length** | **Download** |
|
| 21 |
| :----------------: | :---------------: | :-------------------: | :----------------: | :----------: |
|
| 22 |
+
| Ring-lite-distill-preview | 16.8B | 2.75B | 64K | [🤗 HuggingFace](https://huggingface.co/inclusionAI/Ring-lite) |
|
| 23 |
|
| 24 |
</div>
|
| 25 |
|
| 26 |
## Evaluation
|
| 27 |
+
To evaluate the model's reasoning performance, we tested Ring-lite on several reasoning benchmarks, including MATH-500, AIME-24, LiveCodeBench, Codeforces and GPQA.
|
| 28 |
### Reasoning ability
|
| 29 |
|
| 30 |
|
| 31 |
More details will be reported in our technical report. [TBD]
|
| 32 |
|
| 33 |
## Quickstart
|