LiangJiang committed on
Commit 0b0286f · verified · 1 Parent(s): 66ca3a5

Update README.md

Files changed (1):
  1. README.md +4 -23
README.md CHANGED
@@ -2,7 +2,7 @@
 # Ring-lite
 
 <p align="center">
-<img src="https://huggingface.co/inclusionAI/Ring-lite-distill-preview/resolve/main/ant-bailing.png" width="100"/>
+<img src="https://huggingface.co/inclusionAI/Ring-lite/resolve/main/ant-bailing.png" width="100"/>
 <p>
 
 <p align="center">
@@ -11,7 +11,7 @@
 
 ## Introduction
 
-Ring-lite is an fully open-source MoE LLM provided by InclusionAI, which has 16.8B parameters with 2.75B activated parameters. It was derived from [Ling-lite-1.5](https://huggingface.co/inclusionAI/Ling-lite-1.5) through a training process involving reasoning SFT, reasoning RL and general SFT. This model delivers performance comparable to [Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B) on reasoning benchmarks, while activating only one-third of their parameter. . This demonstrates that Ring-lite-distill is a more balanced and versatile model. Additionaly, it maintains competitive latency and throughput compared to other reasoning LLMs of similar size.
+Ring-lite is a fully open-source MoE LLM provided by InclusionAI, with 16.8B total parameters and 2.75B activated parameters. It was derived from [Ling-lite-1.5](https://huggingface.co/inclusionAI/Ling-lite-1.5) through a training process involving reasoning SFT, reasoning RL, and general SFT. It delivers performance comparable to [Qwen3-8B](https://huggingface.co/Qwen/Qwen3-8B) on reasoning benchmarks while activating only about one-third as many parameters.
 
 ## Model Downloads
 
@@ -19,34 +19,15 @@ Ring-lite is an fully open-source MoE LLM provided by InclusionAI, which has 16.
 
 | **Model** | **#Total Params** | **#Activated Params** | **Context Length** | **Download** |
 | :----------------: | :---------------: | :-------------------: | :----------------: | :----------: |
-| Ring-lite-distill-preview | 16.8B | 2.75B | 64K | [🤗 HuggingFace](https://huggingface.co/inclusionAI/Ring-lite-distill) |
+| Ring-lite-distill-preview | 16.8B | 2.75B | 64K | [🤗 HuggingFace](https://huggingface.co/inclusionAI/Ring-lite) |
 
 </div>
 
 ## Evaluation
-In order to fully evaluate the model's performance, we examined Ring-lite-distill-preview in terms of both reasoning ability and general ability.
+To fully evaluate the model's reasoning performance, we examined Ring-lite on several reasoning benchmarks, including MATH-500, AIME-24, LiveCodeBench, Codeforces, and GPQA.
 ### Reasoning ability
 
-<div align="center">
-
-| **Model** | **AIME24** | **MATH-500** | **GPQA-diamond** | **LiveCodeBench** |
-| :----------------: | :---------------: | :-------------------: | :----------------: | :----------: |
-| DeepSeek-R1-Distill-Qwen-7B (reported) | 55.5 | 92.8 | 49.1 | 37.6 |
-| DeepSeek-R1-Distill-Qwen-7B (reproduce) | 53.2 | 93.7 | 50.4 | 36.5 |
-| Ring-lite-distill-preview | 56.3 | 93.7 | 46.2 | 31.9 |
-
-</div>
-
-### General ability
 
-<div align="center">
-
-| **Model** | **IFEval** | **T-eval** | **BFCL_v2** | **MMLU** |
-| :----------------: | :---------------: | :-------------------: | :----------------: | :----------: |
-| DeepSeek-R1-Distill-Qwen-7B (reproduce) | 39.3 | 26.9 | 38.9 | 44.1 |
-| Ring-lite-distill-preview | 75.3 | 81.3 | 63.0 | 63.3 |
-
-</div>
 More details will be reported in our technical report. [TBD]
 
 ## Quickstart
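
The Introduction's efficiency claim (activating roughly one-third as many parameters as Qwen3-8B) is easy to sanity-check with the parameter counts from the download table. A minimal sketch, assuming Qwen3-8B totals roughly 8.2B parameters (a figure not stated in this README):

```python
# Sanity check of the Introduction's activated-parameter comparison.
# Counts are in billions; the Qwen3-8B total (~8.2B) is an assumed
# figure, not taken from this README.
ring_lite_total = 16.8    # Ring-lite total parameters (B), from the table
ring_lite_active = 2.75   # Ring-lite activated parameters (B), from the table
qwen3_8b_total = 8.2      # assumed Qwen3-8B parameter count (B)

active_fraction = ring_lite_active / ring_lite_total
ratio_vs_qwen = ring_lite_active / qwen3_8b_total

print(f"fraction of Ring-lite activated per token: {active_fraction:.1%}")  # 16.4%
print(f"activated params vs. Qwen3-8B total: {ratio_vs_qwen:.2f}")          # 0.34
```

Under that assumption the ratio comes out near one-third, consistent with the README's claim.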