BubbleQ committed
Commit 5c3e89c · verified · 1 parent: 4336d6c

Update README.md

Files changed (1): README.md (+8 −7)
README.md CHANGED

@@ -25,6 +25,7 @@ library_name: transformers
 
 
 `Klear-46B-A2.5B` is a sparse Mixture-of-Experts (MoE) large language model developed by **the Kwai-Klear Team at Kuaishou**, designed to deliver both **high performance** and **inference efficiency**. It features **256 experts**, with only **8 experts and 1 shared expert activated** per layer during the forward pass, resulting in **46 billion total parameters** but just **2.5 billion active** — achieving dense-level performance at a fraction of the computational cost.
+
 The model was trained on over **22 trillion tokens** using a **three-stage progressive curriculum**:
 
 **1. Foundational Knowledge Learning (12T tokens):**
@@ -65,7 +66,7 @@ The base and instruction tuned + DPO models have the following architecture:
 | **Model** | **#Total Params** | **#Activated Params** | **Context Length** | **Download Link** |
 | :------------: | :------------: | :------------: | :------------: | :------------: |
 | Klear-46B-A2.5B-Base | 46B | 2.5B | 64K | [🤗 Hugging Face](https://huggingface.co/Kwai-Klear/Klear-46B-A2.5B-Base) |
-| Klear-46B-A2.5B-Inst. | 46B | 2.5B | 64K | [🤗 Hugging Face](https://huggingface.co/Kwai-Klear) |
+| Klear-46B-A2.5B-Instruct | 46B | 2.5B | 64K | [🤗 Hugging Face](https://huggingface.co/Kwai-Klear) |
 
 </div>
 
@@ -99,8 +100,8 @@ Note:
 1. `*` During pretraining, we found that the HumanEval metric fluctuated significantly and was extremely sensitive to formatting. We therefore used the prompt from the Ling-series paper to modify the original HumanEval; the results in the table are the evaluation metrics after this modification.
 2. For Mimo-base-7B, the results marked with `*` are sourced from their public report; the other evaluations were conducted with internal evaluation frameworks.
 
-### Klear-46B-A2.5B-Inst. Evaluation Results
-| Ability | Benchmark | Klear-46B-A2.5B | InternLM3-8B-Instruct | MiniCPM4-8B | Qwen3-8B (NoThink) | gemma3-12b-it | Phi4-14B | Qwen3-30B-A3B-2507 |
+### Klear-46B-A2.5B-Instruct Evaluation Results
+| Ability | Benchmark | Klear-46B-A2.5B-Instruct | InternLM3-8B-Instruct | MiniCPM4-8B | Qwen3-8B (NoThink) | gemma3-12b-it | Phi4-14B | Qwen3-30B-A3B-2507 |
 | ------------- | --------------------------- | --------------- | --------------------- | ----------- | ------------------ | ------------- | -------- | ------------------ |
 | | # Total Params | 46B | 8B | 8B | 8B | 12B | 14B | 30B |
 | | # Activated Params | 2.5B | 8B | 8B | 8B | 12B | 14B | 3B |
@@ -155,13 +156,13 @@ result = tokenizer.decode(outputs[0], skip_special_tokens=True)
 print(result)
 ```
 
-#### Klear-46B-A2.5B-Inst.
+#### Klear-46B-A2.5B-Instruct
 
 ```python
 import torch
 from transformers import AutoTokenizer, AutoModelForCausalLM, GenerationConfig
 
-model_path = "/path/to/Klear-Inst."
+model_path = "/path/to/Klear-Instruct"
 tokenizer = AutoTokenizer.from_pretrained(model_path)
 
 model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto", dtype=torch.bfloat16, trust_remote_code=True)
@@ -184,7 +185,7 @@ print(result)
 git clone https://github.com/Kwai-Klear/vllm.git
 cd vllm
 VLLM_USE_PRECOMPILED=1 pip install --editable .
-vllm serve /path/to/Klear-Inst. --port 8000 --tensor-parallel-size 8 --trust-remote-code
+vllm serve /path/to/Klear-Instruct --port 8000 --tensor-parallel-size 8 --trust-remote-code
 ```
 
 An OpenAI-compatible API will be available at `http://localhost:8000/v1`.
@@ -194,7 +195,7 @@ Or you can refer to the following Python script for offline inference
 from vllm import LLM, SamplingParams
 from transformers import AutoTokenizer
 
-model_path = "/path/to/Klear-Inst."
+model_path = "/path/to/Klear-Instruct"
 tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
 
 llm = LLM(
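
As a quick sanity check on the sparsity figures quoted in the README text touched by this commit (46B total parameters, 2.5B active per token, 8 of 256 routed experts plus 1 shared expert per layer), here is a minimal arithmetic sketch. All numbers come straight from the README prose; nothing below reflects the model's actual config file.

```python
# Sanity-check the sparsity figures quoted in the README text:
# 46B total parameters, 2.5B active per token, and 8 of 256 routed
# experts plus 1 shared expert activated per MoE layer.
total_params = 46e9
active_params = 2.5e9

# Fraction of all parameters touched on a single forward pass (~5.4%).
active_fraction = active_params / total_params
print(f"active parameter fraction: {active_fraction:.1%}")

# Fraction of routed experts selected per layer; the shared expert
# is always on, so it is counted separately.
num_routed_experts = 256
routed_active = 8
shared_active = 1
routed_fraction = routed_active / num_routed_experts
print(f"routed experts per layer: {routed_active}/{num_routed_experts} "
      f"({routed_fraction:.2%}), plus {shared_active} shared expert")
```

This is the sense in which the card claims "dense-level performance at a fraction of the computational cost": per-token compute scales with the ~2.5B active parameters, not the 46B stored ones.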