jobs-git zR committed on
Commit 2010b19 · verified · 0 parents

Duplicate from zai-org/GLM-4.5


Co-authored-by: zR <ZAHNGYUXUAN@users.noreply.huggingface.co>

This view is limited to 50 files because the commit contains too many changes. See the raw diff for the full change set.
Files changed (50)
  1. .gitattributes +36 -0
  2. README.md +186 -0
  3. chat_template.jinja +103 -0
  4. config.json +43 -0
  5. generation_config.json +10 -0
  6. model-00001-of-00093.safetensors +3 -0
  7. model-00002-of-00093.safetensors +3 -0
  8. model-00003-of-00093.safetensors +3 -0
  9. model-00004-of-00093.safetensors +3 -0
  10. model-00005-of-00093.safetensors +3 -0
  11. model-00006-of-00093.safetensors +3 -0
  12. model-00007-of-00093.safetensors +3 -0
  13. model-00008-of-00093.safetensors +3 -0
  14. model-00009-of-00093.safetensors +3 -0
  15. model-00010-of-00093.safetensors +3 -0
  16. model-00011-of-00093.safetensors +3 -0
  17. model-00012-of-00093.safetensors +3 -0
  18. model-00013-of-00093.safetensors +3 -0
  19. model-00014-of-00093.safetensors +3 -0
  20. model-00015-of-00093.safetensors +3 -0
  21. model-00016-of-00093.safetensors +3 -0
  22. model-00017-of-00093.safetensors +3 -0
  23. model-00018-of-00093.safetensors +3 -0
  24. model-00019-of-00093.safetensors +3 -0
  25. model-00020-of-00093.safetensors +3 -0
  26. model-00021-of-00093.safetensors +3 -0
  27. model-00022-of-00093.safetensors +3 -0
  28. model-00023-of-00093.safetensors +3 -0
  29. model-00024-of-00093.safetensors +3 -0
  30. model-00025-of-00093.safetensors +3 -0
  31. model-00026-of-00093.safetensors +3 -0
  32. model-00027-of-00093.safetensors +3 -0
  33. model-00028-of-00093.safetensors +3 -0
  34. model-00029-of-00093.safetensors +3 -0
  35. model-00030-of-00093.safetensors +3 -0
  36. model-00031-of-00093.safetensors +3 -0
  37. model-00032-of-00093.safetensors +3 -0
  38. model-00033-of-00093.safetensors +3 -0
  39. model-00034-of-00093.safetensors +3 -0
  40. model-00035-of-00093.safetensors +3 -0
  41. model-00036-of-00093.safetensors +3 -0
  42. model-00037-of-00093.safetensors +3 -0
  43. model-00038-of-00093.safetensors +3 -0
  44. model-00039-of-00093.safetensors +3 -0
  45. model-00040-of-00093.safetensors +3 -0
  46. model-00041-of-00093.safetensors +3 -0
  47. model-00042-of-00093.safetensors +3 -0
  48. model-00043-of-00093.safetensors +3 -0
  49. model-00044-of-00093.safetensors +3 -0
  50. model-00045-of-00093.safetensors +3 -0
.gitattributes ADDED
@@ -0,0 +1,36 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tar filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
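The patterns above route large binary artifacts through Git LFS. As an illustrative sketch (not part of the repository), a shell-style glob check approximates which filenames such patterns would catch; note that real `.gitattributes` matching has extra rules beyond plain globs, and only a subset of the patterns is used here.

```python
# Illustrative sketch: approximate the LFS routing rules above with
# shell-style globs (a simplification of gitattributes semantics).
from fnmatch import fnmatch

# Subset of the patterns listed in the diff above.
lfs_patterns = ["*.safetensors", "*.bin", "*.pt", "tokenizer.json"]

def tracked_by_lfs(filename: str) -> bool:
    """Return True if the filename matches any of the (simplified) LFS patterns."""
    return any(fnmatch(filename, pattern) for pattern in lfs_patterns)

print(tracked_by_lfs("model-00001-of-00093.safetensors"))  # True
print(tracked_by_lfs("config.json"))                       # False
```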
README.md ADDED
@@ -0,0 +1,186 @@
+ ---
+ language:
+ - en
+ - zh
+ library_name: transformers
+ license: mit
+ pipeline_tag: text-generation
+ ---
+
+ # GLM-4.5
+
+ <div align="center">
+ <img src="https://raw.githubusercontent.com/zai-org/GLM-4.5/refs/heads/main/resources/logo.svg" width="15%"/>
+ </div>
+ <p align="center">
+ 👋 Join our <a href="https://discord.gg/QR7SARHRxK" target="_blank">Discord</a> community.
+ <br>
+ 📖 Check out the GLM-4.5 <a href="https://z.ai/blog/glm-4.5" target="_blank">technical blog</a>, <a href="https://arxiv.org/abs/2508.06471" target="_blank">technical report</a>, and <a href="https://zhipu-ai.feishu.cn/wiki/Gv3swM0Yci7w7Zke9E0crhU7n7D" target="_blank">Zhipu AI technical documentation</a>.
+ <br>
+ 📍 Use GLM-4.5 API services on the <a href="https://docs.z.ai/guides/llm/glm-4.5">Z.ai API Platform (Global)</a> or the <br> <a href="https://docs.bigmodel.cn/cn/guide/models/text/glm-4.5">Zhipu AI Open Platform (Mainland China)</a>.
+ <br>
+ 👉 Try <a href="https://chat.z.ai">GLM-4.5</a> with one click.
+ </p>
+
+ ## Model Introduction
+
+ The **GLM-4.5** series models are foundation models designed for intelligent agents. GLM-4.5 has **355** billion total parameters with **32** billion active parameters, while GLM-4.5-Air adopts a more compact design with **106** billion total parameters and **12** billion active parameters. GLM-4.5 models unify reasoning, coding, and intelligent agent capabilities to meet the complex demands of intelligent agent applications.
+
+ Both GLM-4.5 and GLM-4.5-Air are hybrid reasoning models that provide two modes: a thinking mode for complex reasoning and tool usage, and a non-thinking mode for immediate responses.
+
+ We have open-sourced the base models, hybrid reasoning models, and FP8 versions of the hybrid reasoning models for both GLM-4.5 and GLM-4.5-Air. They are released under the MIT license and can be used commercially and for secondary development.
+
+ In our comprehensive evaluation across 12 industry-standard benchmarks, GLM-4.5 achieves exceptional performance with a score of **63.2**, ranking **3rd** among all proprietary and open-source models. Notably, GLM-4.5-Air delivers competitive results at **59.8** while maintaining superior efficiency.
+
+ ![bench](https://raw.githubusercontent.com/zai-org/GLM-4.5/refs/heads/main/resources/bench.png)
+
+ For more evaluation results, showcases, and technical details, please visit
+ our [technical blog](https://z.ai/blog/glm-4.5) or [technical report](https://arxiv.org/abs/2508.06471).
+
+ The model code, tool parser, and reasoning parser can be found in the implementations in [transformers](https://github.com/huggingface/transformers/tree/main/src/transformers/models/glm4_moe), [vLLM](https://github.com/vllm-project/vllm/blob/main/vllm/model_executor/models/glm4_moe_mtp.py), and [SGLang](https://github.com/sgl-project/sglang/blob/main/python/sglang/srt/models/glm4_moe.py).
+
+ ## Model Downloads
+
+ You can directly experience the model on [Hugging Face](https://huggingface.co/spaces/zai-org/GLM-4.5-Space)
+ or [ModelScope](https://modelscope.cn/studios/ZhipuAI/GLM-4.5-Demo), or download the model via the links below.
+
+ | Model | Download Links | Model Size | Precision |
+ |------------------|------------------------------------------------------------------------------------------------------------------------------------------|------------|-----------|
+ | GLM-4.5 | [🤗 Hugging Face](https://huggingface.co/zai-org/GLM-4.5)<br> [🤖 ModelScope](https://modelscope.cn/models/ZhipuAI/GLM-4.5) | 355B-A32B | BF16 |
+ | GLM-4.5-Air | [🤗 Hugging Face](https://huggingface.co/zai-org/GLM-4.5-Air)<br> [🤖 ModelScope](https://modelscope.cn/models/ZhipuAI/GLM-4.5-Air) | 106B-A12B | BF16 |
+ | GLM-4.5-FP8 | [🤗 Hugging Face](https://huggingface.co/zai-org/GLM-4.5-FP8)<br> [🤖 ModelScope](https://modelscope.cn/models/ZhipuAI/GLM-4.5-FP8) | 355B-A32B | FP8 |
+ | GLM-4.5-Air-FP8 | [🤗 Hugging Face](https://huggingface.co/zai-org/GLM-4.5-Air-FP8)<br> [🤖 ModelScope](https://modelscope.cn/models/ZhipuAI/GLM-4.5-Air-FP8) | 106B-A12B | FP8 |
+ | GLM-4.5-Base | [🤗 Hugging Face](https://huggingface.co/zai-org/GLM-4.5-Base)<br> [🤖 ModelScope](https://modelscope.cn/models/ZhipuAI/GLM-4.5-Base) | 355B-A32B | BF16 |
+ | GLM-4.5-Air-Base | [🤗 Hugging Face](https://huggingface.co/zai-org/GLM-4.5-Air-Base)<br> [🤖 ModelScope](https://modelscope.cn/models/ZhipuAI/GLM-4.5-Air-Base) | 106B-A12B | BF16 |
+
+ ## System Requirements
+
+ ### Inference
+
+ We provide minimum and recommended configurations for "full-featured" model inference. The data in the table below is
+ based on the following conditions:
+
+ 1. All models use MTP layers and specify
+ `--speculative-num-steps 3 --speculative-eagle-topk 1 --speculative-num-draft-tokens 4` to ensure competitive
+ inference speed.
+ 2. The `cpu-offload` parameter is not used.
+ 3. The inference batch size does not exceed `8`.
+ 4. All tests are executed on devices that natively support FP8 inference, ensuring both weights and cache are in FP8 format.
+ 5. Server memory must exceed `1 TB` to ensure normal model loading and operation.
+
+ The models can run under the configurations in the table below:
+
+ | Model | Precision | GPU Type and Count | Test Framework |
+ |-------------|-----------|----------------------|----------------|
+ | GLM-4.5 | BF16 | H100 x 16 / H200 x 8 | sglang |
+ | GLM-4.5 | FP8 | H100 x 8 / H200 x 4 | sglang |
+ | GLM-4.5-Air | BF16 | H100 x 4 / H200 x 2 | sglang |
+ | GLM-4.5-Air | FP8 | H100 x 2 / H200 x 1 | sglang |
+
+ Under the configurations in the table below, the models can utilize their full 128K context length:
+
+ | Model | Precision | GPU Type and Count | Test Framework |
+ |-------------|-----------|-----------------------|----------------|
+ | GLM-4.5 | BF16 | H100 x 32 / H200 x 16 | sglang |
+ | GLM-4.5 | FP8 | H100 x 16 / H200 x 8 | sglang |
+ | GLM-4.5-Air | BF16 | H100 x 8 / H200 x 4 | sglang |
+ | GLM-4.5-Air | FP8 | H100 x 4 / H200 x 2 | sglang |
+
+ ### Fine-tuning
+
+ The code can run under the configurations in the table below
+ using [Llama Factory](https://github.com/hiyouga/LLaMA-Factory):
+
+ | Model | GPU Type and Count | Strategy | Batch Size (per GPU) |
+ |-------------|--------------------|----------|----------------------|
+ | GLM-4.5 | H100 x 16 | LoRA | 1 |
+ | GLM-4.5-Air | H100 x 4 | LoRA | 1 |
+
+ The code can run under the configurations in the table below using [Swift](https://github.com/modelscope/ms-swift):
+
+ | Model | GPU Type and Count | Strategy | Batch Size (per GPU) |
+ |-------------|--------------------|----------|----------------------|
+ | GLM-4.5 | H20 (96GiB) x 16 | LoRA | 1 |
+ | GLM-4.5-Air | H20 (96GiB) x 4 | LoRA | 1 |
+ | GLM-4.5 | H20 (96GiB) x 128 | SFT | 1 |
+ | GLM-4.5-Air | H20 (96GiB) x 32 | SFT | 1 |
+ | GLM-4.5 | H20 (96GiB) x 128 | RL | 1 |
+ | GLM-4.5-Air | H20 (96GiB) x 32 | RL | 1 |
+
+ ## Quick Start
+
+ Please install the required packages according to `requirements.txt`:
+
+ ```shell
+ pip install -r requirements.txt
+ ```
+
+ ### transformers
+
+ Please refer to the `trans_infer_cli.py` code in the `inference` folder.
+
+ ### vLLM
+
+ + Both BF16 and FP8 versions can be launched with the following command:
+
+ ```shell
+ vllm serve zai-org/GLM-4.5-Air \
+     --tensor-parallel-size 8 \
+     --tool-call-parser glm45 \
+     --reasoning-parser glm45 \
+     --enable-auto-tool-choice \
+     --served-model-name glm-4.5-air
+ ```
+
+ If you're using 8x H100 GPUs and encounter insufficient memory when running the GLM-4.5 model, you'll need to add
+ `--cpu-offload-gb 16` (only applicable to vLLM).
+
+ If you encounter FlashInfer issues, set `VLLM_ATTENTION_BACKEND=XFORMERS` as a temporary workaround. You can also
+ specify `TORCH_CUDA_ARCH_LIST='9.0+PTX'` to use FlashInfer (different GPUs require different `TORCH_CUDA_ARCH_LIST`
+ values; please check accordingly).
+
+ ### SGLang
+
+ + BF16
+
+ ```shell
+ python3 -m sglang.launch_server \
+     --model-path zai-org/GLM-4.5-Air \
+     --tp-size 8 \
+     --tool-call-parser glm45 \
+     --reasoning-parser glm45 \
+     --speculative-algorithm EAGLE \
+     --speculative-num-steps 3 \
+     --speculative-eagle-topk 1 \
+     --speculative-num-draft-tokens 4 \
+     --mem-fraction-static 0.7 \
+     --served-model-name glm-4.5-air \
+     --host 0.0.0.0 \
+     --port 8000
+ ```
+
+ + FP8
+
+ ```shell
+ python3 -m sglang.launch_server \
+     --model-path zai-org/GLM-4.5-Air-FP8 \
+     --tp-size 4 \
+     --tool-call-parser glm45 \
+     --reasoning-parser glm45 \
+     --speculative-algorithm EAGLE \
+     --speculative-num-steps 3 \
+     --speculative-eagle-topk 1 \
+     --speculative-num-draft-tokens 4 \
+     --mem-fraction-static 0.7 \
+     --disable-shared-experts-fusion \
+     --served-model-name glm-4.5-air-fp8 \
+     --host 0.0.0.0 \
+     --port 8000
+ ```
+
+ ### Request Parameter Instructions
+
+ + When using `vLLM` or `SGLang`, thinking mode is enabled by default for incoming requests. To disable thinking,
+ add the `extra_body={"chat_template_kwargs": {"enable_thinking": False}}` parameter.
+ + Both frameworks support tool calling. Please use the OpenAI-style tool description format for calls.
+ + For sample code, please refer to `api_request.py` in the `inference` folder.
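The thinking switch described in the Request Parameter Instructions can be exercised from any OpenAI-compatible client. A minimal sketch (the model name and message are placeholders for a locally served instance) builds such a request body; with the official `openai` Python client, the non-standard `chat_template_kwargs` field is passed via `extra_body`:

```python
# Sketch of an OpenAI-style chat request with thinking mode disabled.
# "glm-4.5-air" assumes the --served-model-name used when launching the server.
payload = {
    "model": "glm-4.5-air",
    "messages": [{"role": "user", "content": "Hello"}],
    # Non-standard field: goes through extra_body when using the openai client.
    "chat_template_kwargs": {"enable_thinking": False},
}
print(payload["chat_template_kwargs"])  # {'enable_thinking': False}
```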
chat_template.jinja ADDED
@@ -0,0 +1,103 @@
+ [gMASK]<sop>
+ {%- if tools -%}
+ <|system|>
+ # Tools
+
+ You may call one or more functions to assist with the user query.
+
+ You are provided with function signatures within <tools></tools> XML tags:
+ <tools>
+ {% for tool in tools %}
+ {{ tool | tojson(ensure_ascii=False) }}
+ {% endfor %}
+ </tools>
+
+ For each function call, output the function name and arguments within the following XML format:
+ <tool_call>{function-name}
+ <arg_key>{arg-key-1}</arg_key>
+ <arg_value>{arg-value-1}</arg_value>
+ <arg_key>{arg-key-2}</arg_key>
+ <arg_value>{arg-value-2}</arg_value>
+ ...
+ </tool_call>{%- endif -%}
+ {%- macro visible_text(content) -%}
+ {%- if content is string -%}
+ {{- content }}
+ {%- elif content is iterable and content is not mapping -%}
+ {%- for item in content -%}
+ {%- if item is mapping and item.type == 'text' -%}
+ {{- item.text }}
+ {%- elif item is string -%}
+ {{- item }}
+ {%- endif -%}
+ {%- endfor -%}
+ {%- else -%}
+ {{- content }}
+ {%- endif -%}
+ {%- endmacro -%}
+ {%- set ns = namespace(last_user_index=-1) %}
+ {%- for m in messages %}
+ {%- if m.role == 'user' %}
+ {% set ns.last_user_index = loop.index0 -%}
+ {%- endif %}
+ {%- endfor %}
+ {% for m in messages %}
+ {%- if m.role == 'user' -%}<|user|>
+ {{ visible_text(m.content) }}
+ {{- '/nothink' if (enable_thinking is defined and not enable_thinking and not visible_text(m.content).endswith("/nothink")) else '' -}}
+ {%- elif m.role == 'assistant' -%}
+ <|assistant|>
+ {%- set reasoning_content = '' %}
+ {%- set content = visible_text(m.content) %}
+ {%- if m.reasoning_content is string %}
+ {%- set reasoning_content = m.reasoning_content %}
+ {%- else %}
+ {%- if '</think>' in content %}
+ {%- set reasoning_content = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
+ {%- set content = content.split('</think>')[-1].lstrip('\n') %}
+ {%- endif %}
+ {%- endif %}
+ {%- if loop.index0 > ns.last_user_index and reasoning_content -%}
+ {{ '\n<think>' + reasoning_content.strip() + '</think>'}}
+ {%- else -%}
+ {{ '\n<think></think>' }}
+ {%- endif -%}
+ {%- if content.strip() -%}
+ {{ '\n' + content.strip() }}
+ {%- endif -%}
+ {% if m.tool_calls %}
+ {% for tc in m.tool_calls %}
+ {%- if tc.function %}
+ {%- set tc = tc.function %}
+ {%- endif %}
+ {{ '\n<tool_call>' + tc.name }}
+ {% set _args = tc.arguments %}
+ {% for k, v in _args.items() %}
+ <arg_key>{{ k }}</arg_key>
+ <arg_value>{{ v | tojson(ensure_ascii=False) if v is not string else v }}</arg_value>
+ {% endfor %}
+ </tool_call>{% endfor %}
+ {% endif %}
+ {%- elif m.role == 'tool' -%}
+ {%- if m.content is string -%}
+ {%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
+ {{- '<|observation|>' }}
+ {%- endif %}
+ {{- '\n<tool_response>\n' }}
+ {{- m.content }}
+ {{- '\n</tool_response>' }}
+ {%- else -%}
+ <|observation|>{% for tr in m.content %}
+
+ <tool_response>
+ {{ tr.output if tr.output is defined else tr }}
+ </tool_response>{% endfor -%}
+ {% endif -%}
+ {%- elif m.role == 'system' -%}
+ <|system|>
+ {{ visible_text(m.content) }}
+ {%- endif -%}
+ {%- endfor -%}
+ {%- if add_generation_prompt -%}
+ <|assistant|>{{- '\n<think></think>' if (enable_thinking is defined and not enable_thinking) else '' -}}
+ {%- endif -%}
config.json ADDED
@@ -0,0 +1,43 @@
+ {
+ "architectures": [
+ "Glm4MoeForCausalLM"
+ ],
+ "attention_bias": true,
+ "attention_dropout": 0.0,
+ "pad_token_id": 151329,
+ "eos_token_id": [
+ 151329,
+ 151336,
+ 151338
+ ],
+ "head_dim": 128,
+ "hidden_act": "silu",
+ "hidden_size": 5120,
+ "partial_rotary_factor": 0.5,
+ "initializer_range": 0.02,
+ "intermediate_size": 12288,
+ "max_position_embeddings": 131072,
+ "model_type": "glm4_moe",
+ "moe_intermediate_size": 1536,
+ "norm_topk_prob": true,
+ "num_attention_heads": 96,
+ "n_group": 1,
+ "topk_group": 1,
+ "n_routed_experts": 160,
+ "n_shared_experts": 1,
+ "routed_scaling_factor": 2.5,
+ "num_experts_per_tok": 8,
+ "first_k_dense_replace": 3,
+ "num_hidden_layers": 92,
+ "num_key_value_heads": 8,
+ "rms_norm_eps": 1e-05,
+ "rope_scaling": null,
+ "rope_theta": 1000000,
+ "num_nextn_predict_layers": 1,
+ "tie_word_embeddings": false,
+ "torch_dtype": "bfloat16",
+ "transformers_version": "4.54.0",
+ "use_cache": true,
+ "use_qk_norm": true,
+ "vocab_size": 151552
+ }
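The MoE fields above determine how many experts fire per token. A back-of-the-envelope sketch (our own arithmetic from the config, not an official figure): the router selects `num_experts_per_tok` of the `n_routed_experts`, while the `n_shared_experts` are always active.

```python
# Per-token expert activation derived from the config.json fields above.
config = {
    "n_routed_experts": 160,
    "num_experts_per_tok": 8,
    "n_shared_experts": 1,
}

# Experts computing each token: 8 routed + 1 shared.
active_experts = config["num_experts_per_tok"] + config["n_shared_experts"]
# Fraction of the routed-expert pool used per token.
routed_fraction = config["num_experts_per_tok"] / config["n_routed_experts"]
print(active_experts, routed_fraction)  # 9 0.05
```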
generation_config.json ADDED
@@ -0,0 +1,10 @@
+ {
+ "_from_model_config": true,
+ "eos_token_id": [
+ 151329,
+ 151336,
+ 151338
+ ],
+ "pad_token_id": 151329,
+ "transformers_version": "4.54.0"
+ }
model-00001-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c3ec0ed85ee95cae5c9b76a7af7272a23169b81705c177c0212f65b24d4b8b41
+ size 3753953568
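Each of these shard entries is a Git LFS pointer file, not the tensor data itself: the real weights are fetched by content hash at checkout. A small sketch (hypothetical helper, standard library only) parses one such pointer:

```python
# Parse a git-lfs pointer file of the form shown above:
#   version <url> / oid sha256:<hex> / size <bytes>
def parse_lfs_pointer(text: str) -> dict:
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    return {
        "version": fields["version"],
        "oid": fields["oid"].removeprefix("sha256:"),
        "size": int(fields["size"]),
    }

pointer = (
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:c3ec0ed85ee95cae5c9b76a7af7272a23169b81705c177c0212f65b24d4b8b41\n"
    "size 3753953568\n"
)
info = parse_lfs_pointer(pointer)
print(info["size"])  # 3753953568
```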
model-00002-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8c5ed321e28aa0c3d31c1b9a9ac0e5666d915ffd89dff216a34427b33fb03c55
+ size 650168352
model-00003-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c93996707aea7dc24a24e97eb72a57fbfd2d197a3cab970adf334ed28ae67abb
+ size 650168352
model-00004-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c76581125a03a274386d8d0cd0e2c84242135bd0b728c33a3b311e1967b53ab7
+ size 7871313120
model-00005-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:72cdb466003372faf78d647ffec247c33f0f2346177147ecce95e8e50b7b678a
+ size 7871313120
model-00006-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e643fc9c4b55c9861a965a744b85ae54453275d7a4dfdfad0ca7a9517ad18011
+ size 7871313120
model-00007-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6145a87fc2f1e30123293802181b1c684ac710cdd6c929b2dab145154bbd9b7c
+ size 7871313120
model-00008-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:22c5cfd2f82adabf3db579a5ff217506d0f5e1b7211492ef73b2209bb138f41f
+ size 7871313120
model-00009-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:82e37371e81407fd7b5cd543532c6bdca2bf865a2ee567e8f731777b19a9bd43
+ size 7871313120
model-00010-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:afa2a54871e04125363468a6ca024de373ab38edb6f58b657913c2cd17ce5764
+ size 7871313120
model-00011-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:049c06bc5a93318dcf91a7c60109c08e8df95a9a5130196a683434d9ede437f8
+ size 7871313616
model-00012-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fa88ebbabdc43302cf19ac2d81b27f2b16a3154fdd9ed096040615fa65013805
+ size 7871313616
model-00013-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b5a222612fd47c6ca8166275e66e723819baa3068eee2e1694f326066ac67ce4
+ size 7871313616
model-00014-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9b8d2da26be63fe9764a4a4cc6a00e58f406aeeda404c31b61612e5e74f9caf8
+ size 7871313616
model-00015-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1f832318ea861a4841a3ce0886eaa0ca7dee9729f74b5141634d7d7e9d6f2ba9
+ size 7871313616
model-00016-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d344913210c96c6695341094e7da2817ccb94f3946ee9698a76eaaaaf60c1b44
+ size 7871313616
model-00017-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:31a4ce3054f3670fd5a7d1e71f502318e65ebb51ee629bd5eccd98adc736d4f6
+ size 7871313616
model-00018-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:006012f1bf7dd33271f12626000418c43e127b56bc70f3a5e314656ad717f424
+ size 7871313616
model-00019-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:067349f582d85e7bd9b0326b82ee32cede246cf359b4e0b4c6cfe3f95caf8e21
+ size 7871313616
model-00020-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c21f395f920d50618e06b473f64971789746d86f3e20e2eab0483562ced51b2a
+ size 7871313616
model-00021-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3bde74ebffd96f550117c7b297198eadf75fb8c99603b6e78fd6d6bea62a8da8
+ size 7871313616
model-00022-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7f9520583b0947a4bc9e3b22ee93dac2f19256f4f889ae17a5239e4275528db8
+ size 7871313616
model-00023-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:16613154e9ab024458f9acd079d7345db4a05593c5a596176640d68053665753
+ size 7871313616
model-00024-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b08d17d951a625b2cac46fcf15d7e4b6a99a28f91befabeb7d361fa7cac01e7b
+ size 7871313616
model-00025-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dd5cbe90ff83f0b2b91c92d5ba2e100263e8306ce3f6381bdd54051a6a8b6c3d
+ size 7871313616
model-00026-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1ea400d62257ef3f0412c8496c955a9ec7052fa4761dfbf4b35905aa04e73318
+ size 7871313616
model-00027-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:012d00adf402e18bb9a900d5dc549c2c45f047eb486ef2a44517a8e532dd9f2b
+ size 7871313616
model-00028-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:63de598020c34be07e77fb82319760bd539da4adfc48eecf83f64cc85ac35196
+ size 7871313616
model-00029-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3ebe5474925c2edf5dea140224c9d506147f44c5accded8a854edb7538e88d3a
+ size 7871313616
model-00030-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:272627d9996b600a934e39c285914c04949e9e9ecbde8ddc2cdc228af5a0be42
+ size 7871313616
model-00031-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:50b8a452aa3db68fa81daa96d04af6b68aaaf2d9565e2b65ce427fbf821d6573
+ size 7871313616
model-00032-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:12bc051e0ed087b3bcfaba48ecc5a2dab369ec95d0bc43b779daec146b9ab45b
+ size 7871313616
model-00033-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d506e1a1ac6faea9b46ece2be12d1d83d3e6987d8c836b0deebdb77a96baeac0
+ size 7871313616
model-00034-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:22b67f3dfb65d866d6dadeb3b33709641bdf93b34e5cd50ab480f08ca2fe1ab2
+ size 7871313616
model-00035-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ab5187967ed92ebd28040749a6735e6dc5b5b7a921f26ecc320db909b1af3fab
+ size 7871313616
model-00036-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5babc5b903c4e8f30a014d6f63fe679331776eb8e19e1807099604085e506349
+ size 7871313616
model-00037-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8696f7e938ddd6a8a578467302b58907106add63ec9d06ada3f58db1e1f35b8a
+ size 7871313616
model-00038-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:71ad1eda9008b3cca972e8c7637aba3bf598daba0f5f7146fcf70656d6807291
+ size 7871313616
model-00039-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dbc174288c0a6dfe1ed30e7b9d0b14b4008cef1c44121ecba5d3fc5711f9ffcf
+ size 7871313616
model-00040-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:33bc1f191e6ece41ac959fae053e56c8d9b7aae4e248396db308df192b0dfdca
+ size 7871313616
model-00041-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1086567592913cf955a099e6bd36d69803197a5e737895e82d27e139ee797454
+ size 7871313616
model-00042-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:bd01aed8bd9f3875c999a912c10806379df34083c8d5ea86dbeed586218404e5
+ size 7871313616
model-00043-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3757974e615f565fb9179ca721617f92c1c46e02cb27de2173ccf69851a4bbe5
+ size 7871313616
model-00044-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:333883124d4baa2d0ad5527832875418c574a9bb9d022184573d73da6b0450bb
+ size 7871313616
model-00045-of-00093.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2347fda76deadd548681a1b96d6868eed8c00a37808ece8ff71011a54989c091
+ size 7871313616