---
license: mit
language:
- zh
- en
base_model:
- inclusionAI/Ling-lite
pipeline_tag: text-generation
library_name: transformers
---

# Ring-lite-distill-preview

<p align="center">
    <img src="https://huggingface.co/inclusionAI/Ring-lite-distill-preview/resolve/main/ant-bailing.png" width="100"/>
</p>

<p align="center">
          🤗 <a href="https://huggingface.co/inclusionAI">Hugging Face</a>
</p>

## Introduction

Ring-lite-distill-preview is a Mixture-of-Experts (MoE) LLM open-sourced by InclusionAI, with 16.8B total parameters and 2.75B activated parameters. It was fine-tuned from [Ling-lite](https://modelscope.cn/models/inclusionAI/Ling-lite) on extensive reasoning-focused instruction data. On reasoning benchmarks it performs comparably to DeepSeek-R1-Distill-Qwen-7B, and on general benchmarks it scores higher, most notably on function-calling benchmarks (e.g., T-eval, BFCL_v2) and instruction-following benchmarks (e.g., IFEval). This makes Ring-lite-distill-preview a more balanced and versatile model. Additionally, it maintains competitive latency and throughput compared with other reasoning LLMs of similar size.

## Model Downloads

<div align="center">

|     **Model**      | **#Total Params** | **#Activated Params** | **Context Length** | **Download** |
| :----------------: | :---------------: | :-------------------: | :----------------: | :----------: |
| Ring-lite-distill-preview |       16.8B       |         2.75B         |        64K         |      [🤗 HuggingFace](https://huggingface.co/inclusionAI/Ring-lite-distill) |

</div>
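
The parameter counts in the table can be turned into a rough estimate of how much memory the weights need. The sketch below is a back-of-the-envelope calculation, not a measured figure: the bytes-per-parameter values are assumptions about common storage formats, and the estimate ignores the KV cache, activations, and framework overhead. Note that in an MoE model all experts are normally kept in memory, so the total parameter count (not the activated count) drives the weight footprint.

```python
# Rough weight-memory estimate; parameter counts come from the table above.
TOTAL_PARAMS = 16.8e9       # total parameters
ACTIVATED_PARAMS = 2.75e9   # parameters active per token (MoE routing)

# Assumed storage widths in bytes per parameter (not stated in the model card).
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Return the approximate weight size in GB (10^9 bytes) for a dtype."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def activated_fraction() -> float:
    """Return the share of parameters used in each forward pass."""
    return ACTIVATED_PARAMS / TOTAL_PARAMS


if __name__ == "__main__":
    for dtype in BYTES_PER_PARAM:
        size = weight_memory_gb(TOTAL_PARAMS, dtype)
        print(f"{dtype}: ~{size:.1f} GB of weights")
    print(f"activated per token: {activated_fraction():.1%}")
```

Under these assumptions, the weights alone take roughly 33.6 GB in `bfloat16`, while only about one sixth of the parameters are used for any given token.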

## Evaluation
To evaluate the model thoroughly, we measured both the reasoning ability and the general ability of Ring-lite-distill-preview.
### Reasoning ability

<div align="center">

|     **Model**      | **AIME24** | **MATH-500** | **GPQA-diamond** | **LiveCodeBench** |
| :----------------: | :---------------: | :-------------------: | :----------------: | :----------: |
| DeepSeek-R1-Distill-Qwen-7B (reported) |       55.5       |         92.8         |        49.1           |          37.6       |
| DeepSeek-R1-Distill-Qwen-7B (reproduced)  |       53.2       |         93.7         |        50.4         |         36.5       |
| Ring-lite-distill-preview |       56.3       |         93.7         |        46.2        |        31.9       |

</div>

### General ability

<div align="center">

|     **Model**      | **IFEval**  | **T-eval** | **BFCL_v2** | **MMLU** |
| :----------------: | :---------------: | :-------------------: | :----------------: | :----------: |
| DeepSeek-R1-Distill-Qwen-7B (reproduced)  |       39.3       |         26.9          | 38.9 | 44.1 |
| Ring-lite-distill-preview |      75.3 | 81.3 | 63.0 | 63.3 |

</div>
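
To make the comparison in the two tables easier to read, the score gaps can be computed directly. The sketch below copies the scores from the tables above and subtracts the reproduced DeepSeek-R1-Distill-Qwen-7B results from the Ring-lite-distill-preview results; the dictionary and function names are illustrative only.

```python
# Scores copied from the two evaluation tables above.
BASELINE = {  # DeepSeek-R1-Distill-Qwen-7B (reproduced)
    "AIME24": 53.2, "MATH-500": 93.7, "GPQA-diamond": 50.4, "LiveCodeBench": 36.5,
    "IFEval": 39.3, "T-eval": 26.9, "BFCL_v2": 38.9, "MMLU": 44.1,
}
RING = {  # Ring-lite-distill-preview
    "AIME24": 56.3, "MATH-500": 93.7, "GPQA-diamond": 46.2, "LiveCodeBench": 31.9,
    "IFEval": 75.3, "T-eval": 81.3, "BFCL_v2": 63.0, "MMLU": 63.3,
}


def score_gaps(model: dict, baseline: dict) -> dict:
    """Return model minus baseline for each benchmark, rounded to 0.1 points."""
    return {name: round(model[name] - baseline[name], 1) for name in model}


if __name__ == "__main__":
    for name, gap in score_gaps(RING, BASELINE).items():
        print(f"{name:>14}: {gap:+.1f}")
```

A positive value means Ring-lite-distill-preview scores higher on that benchmark. The gaps are small and mixed on the reasoning benchmarks and large and positive on the general ones.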

More details will be reported in our [technical report](https://github.com/inclusionAI/Ring/blob/main/Ring_Lite_Distill_Preview.pdf).

## Quickstart

### 🤗 Hugging Face Transformers
Here is a code snippet to show you how to use the chat model with `transformers`:

```python
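from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "inclusionAI/Ring-lite-distill-preview"

# Load the weights in their stored dtype and place them on the available devices.
# Assumption: if your transformers version does not ship this architecture,
# passing trust_remote_code=True may be needed; check the model repository.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Give me a short introduction to large language models."
messages = [
    {"role": "system", "content": "You are Ring, an assistant created by inclusionAI"},
    {"role": "user", "content": prompt}
]
# Render the conversation with the model's chat template and append the
# marker that tells the model to start the assistant turn.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Reasoning models can produce long outputs, hence the large token budget.
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=8192
)
# generate() returns prompt + completion; keep only the newly generated tokens.
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)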
```
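
The list comprehension near the end of the snippet above works because, for decoder-only models, `generate` returns each prompt followed by its newly generated tokens. The sketch below isolates that step as a small helper operating on plain Python lists so it can be tried without loading the model; the function name `strip_prompt` and the toy token IDs are hypothetical and not part of `transformers`.

```python
from typing import List, Sequence


def strip_prompt(
    input_ids_batch: Sequence[Sequence[int]],
    output_ids_batch: Sequence[Sequence[int]],
) -> List[List[int]]:
    """Drop the leading prompt tokens from each generated sequence.

    Each output sequence is assumed to start with its prompt, so slicing
    off len(prompt) tokens leaves only the completion.
    """
    return [
        list(output_ids[len(input_ids):])
        for input_ids, output_ids in zip(input_ids_batch, output_ids_batch)
    ]


if __name__ == "__main__":
    prompts = [[101, 7, 8], [101, 9, 4]]            # toy prompt token IDs
    outputs = [[101, 7, 8, 55, 56], [101, 9, 4, 77]]  # prompt + completion
    print(strip_prompt(prompts, outputs))
```

The same slicing applies to the tensors returned by `generate`, since every row in a padded batch has the same prompt length.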

## Dataset
The training data of Ring-lite-distill-preview will be released soon. 

## Deployment
For deployment instructions, please refer to the README on [GitHub](https://github.com/inclusionAI/Ring/blob/main/README.md).

## License
This code repository is licensed under [the MIT License](https://huggingface.co/inclusionAI/Ring-lite-distill/blob/main/LICENSE).

## Citation
[TBD]