Instructions to use Pilipdagh/GLM-5 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use Pilipdagh/GLM-5 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Pilipdagh/GLM-5")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Pilipdagh/GLM-5")
model = AutoModelForCausalLM.from_pretrained("Pilipdagh/GLM-5")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use Pilipdagh/GLM-5 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Pilipdagh/GLM-5"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Pilipdagh/GLM-5",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
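
The same request can be issued from Python. The following is a minimal sketch using only the standard library; it assumes the server started above is listening on `localhost:8000` and returns the usual OpenAI-style `choices[0].message.content` field. The helper names `build_chat_request` and `send_chat_request` are hypothetical and not part of vLLM.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, prompt: str) -> urllib.request.Request:
    """Build a POST request for an OpenAI-compatible /v1/chat/completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request: urllib.request.Request, timeout: float = 60.0) -> str:
    """Send the request and return the text of the first completion choice."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


request = build_chat_request(
    "http://localhost:8000", "Pilipdagh/GLM-5", "What is the capital of France?"
)
# Sending needs a running server, so the call is left commented out:
# print(send_chat_request(request))
```

Building the request separately from sending it makes the payload easy to inspect before any network traffic happens.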

Use Docker

docker model run hf.co/Pilipdagh/GLM-5

SGLang

How to use Pilipdagh/GLM-5 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Pilipdagh/GLM-5" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Pilipdagh/GLM-5",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Pilipdagh/GLM-5" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Pilipdagh/GLM-5",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use Pilipdagh/GLM-5 with Docker Model Runner:
```
docker model run hf.co/Pilipdagh/GLM-5
```

Pilipdagh

ZHANGYUXUAN-zR committed on Apr 10

Commit

9202efd

0 Parent(s):

Duplicate from zai-org/GLM-5

Co-authored-by: zR <ZHANGYUXUAN-zR@users.noreply.huggingface.co>

This view is limited to 50 files because it contains too many changes.

Files changed (50)

.eval_results/MathArena--aime_2026.yaml +8 -0
.eval_results/MathArena--hmmt_feb_2026.yaml +8 -0
.eval_results/gpqa.yaml +8 -0
.eval_results/hle.yaml +9 -0
.eval_results/hle_with_tools.yaml +10 -0
.eval_results/swe_bench_verified.yaml +19 -0
.eval_results/terminal_bench.yaml +11 -0
.eval_results/terminal_bench_2.yaml +10 -0
.eval_results/yc-bench.yaml +9 -0
.gitattributes +36 -0
README.md +147 -0
chat_template.jinja +86 -0
config.json +59 -0
generation_config.json +12 -0
model-00001-of-00282.safetensors +3 -0
model-00002-of-00282.safetensors +3 -0
model-00003-of-00282.safetensors +3 -0
model-00004-of-00282.safetensors +3 -0
model-00005-of-00282.safetensors +3 -0
model-00006-of-00282.safetensors +3 -0
model-00007-of-00282.safetensors +3 -0
model-00008-of-00282.safetensors +3 -0
model-00009-of-00282.safetensors +3 -0
model-00010-of-00282.safetensors +3 -0
model-00011-of-00282.safetensors +3 -0
model-00012-of-00282.safetensors +3 -0
model-00013-of-00282.safetensors +3 -0
model-00014-of-00282.safetensors +3 -0
model-00015-of-00282.safetensors +3 -0
model-00016-of-00282.safetensors +3 -0
model-00017-of-00282.safetensors +3 -0
model-00018-of-00282.safetensors +3 -0
model-00019-of-00282.safetensors +3 -0
model-00020-of-00282.safetensors +3 -0
model-00021-of-00282.safetensors +3 -0
model-00022-of-00282.safetensors +3 -0
model-00023-of-00282.safetensors +3 -0
model-00024-of-00282.safetensors +3 -0
model-00025-of-00282.safetensors +3 -0
model-00026-of-00282.safetensors +3 -0
model-00027-of-00282.safetensors +3 -0
model-00028-of-00282.safetensors +3 -0
model-00029-of-00282.safetensors +3 -0
model-00030-of-00282.safetensors +3 -0
model-00031-of-00282.safetensors +3 -0
model-00032-of-00282.safetensors +3 -0
model-00033-of-00282.safetensors +3 -0
model-00034-of-00282.safetensors +3 -0
model-00035-of-00282.safetensors +3 -0
model-00036-of-00282.safetensors +3 -0

.eval_results/MathArena--aime_2026.yaml ADDED Viewed

	@@ -0,0 +1,8 @@

+- dataset:
+    id: MathArena/aime_2026
+    task_id: MathArena/aime_2026
+  value: 95.83
+  date: '2026-02-18'
+  source:
+    url: https://matharena.ai/?comp=aime--aime_2026
+    name: Official MathArena Evaluation

.eval_results/MathArena--hmmt_feb_2026.yaml ADDED Viewed

	@@ -0,0 +1,8 @@

+- dataset:
+    id: MathArena/hmmt_feb_2026
+    task_id: MathArena/hmmt_feb_2026
+  value: 86.36
+  date: '2026-02-23'
+  source:
+    url: https://matharena.ai/?comp=hmmt--hmmt_feb_2026
+    name: Official MathArena Evaluation

.eval_results/gpqa.yaml ADDED Viewed

	@@ -0,0 +1,8 @@

+- dataset:
+    id: Idavidrein/gpqa
+    task_id: diamond
+  value: 86.0
+  date: '2026-02-13'
+  source:
+    url: https://huggingface.co/zai-org/GLM-5
+    name: Model Card

.eval_results/hle.yaml ADDED Viewed

	@@ -0,0 +1,9 @@

+- dataset:
+    id: cais/hle
+    task_id: hle
+  value: 30.5
+  date: '2026-02-13'
+  source:
+    url: https://huggingface.co/zai-org/GLM-5
+    name: Model Card
+    user: SaylorTwift

.eval_results/hle_with_tools.yaml ADDED Viewed

	@@ -0,0 +1,10 @@

+- dataset:
+    id: cais/hle
+    task_id: hle
+  value: 50.4
+  date: '2026-02-13'
+  source:
+    url: https://huggingface.co/zai-org/GLM-5
+    name: Model Card
+    user: SaylorTwift
+  notes: "With tools"

.eval_results/swe_bench_verified.yaml ADDED Viewed

	@@ -0,0 +1,19 @@

+- dataset:
+    id: SWE-bench/SWE-bench_Verified
+    task_id: swe_bench_%_resolved
+  value: 72.80
+  source:
+    url: https://www.swebench.com/
+    name: SWE-Bench official evaluation
+    user: nielsr
+  notes: high reasoning, official
+- dataset:
+    id: SWE-bench/SWE-bench_Verified
+    task_id: swe_bench_%_resolved
+  value: 77.8
+  source:
+    url: https://huggingface.co/zai-org/GLM-5/
+    name: Model card
+    user: nielsr
+  notes: Z.ai reported number

.eval_results/terminal_bench.yaml ADDED Viewed

	@@ -0,0 +1,11 @@

+- dataset:
+    id: harborframework/terminal-bench-2.0
+    task_id: terminal_bench
+  value: 52.4
+  date: '2026-02-23'
+  source:
+    url: https://www.tbench.ai/leaderboard/terminal-bench/2.0
+    name: Terminal-Bench Leaderboard
+    user: burtenshaw
+  notes: "agent: Terminus 2"

.eval_results/terminal_bench_2.yaml ADDED Viewed

	@@ -0,0 +1,10 @@

+- dataset:
+    id: harborframework/terminal-bench-2.0
+    task_id: terminalbench_2
+  value: 52.4
+  date: '2026-02-23'
+  source:
+    url: https://www.tbench.ai/leaderboard/terminal-bench/2.0
+    name: Terminal-Bench Leaderboard
+    user: SaylorTwift
+  notes: "agent: Terminus 2"

.eval_results/yc-bench.yaml ADDED Viewed

	@@ -0,0 +1,9 @@

+- dataset:
+    id: collinear-ai/yc-bench
+    task_id: medium
+  value: 1208190
+  date: "2026-03-24"
+  source:
+    url: https://github.com/collinear-ai/yc-bench
+    name: "YC-Bench eval"
+  notes: "avg final funds (USD) across seeds 1,2,3. GLM-5 (via OpenRouter z-ai/glm-5)"

.gitattributes ADDED Viewed

	@@ -0,0 +1,36 @@

+*.7z filter=lfs diff=lfs merge=lfs -text
+*.arrow filter=lfs diff=lfs merge=lfs -text
+*.bin filter=lfs diff=lfs merge=lfs -text
+*.bz2 filter=lfs diff=lfs merge=lfs -text
+*.ckpt filter=lfs diff=lfs merge=lfs -text
+*.ftz filter=lfs diff=lfs merge=lfs -text
+*.gz filter=lfs diff=lfs merge=lfs -text
+*.h5 filter=lfs diff=lfs merge=lfs -text
+*.joblib filter=lfs diff=lfs merge=lfs -text
+*.lfs.* filter=lfs diff=lfs merge=lfs -text
+*.mlmodel filter=lfs diff=lfs merge=lfs -text
+*.model filter=lfs diff=lfs merge=lfs -text
+*.msgpack filter=lfs diff=lfs merge=lfs -text
+*.npy filter=lfs diff=lfs merge=lfs -text
+*.npz filter=lfs diff=lfs merge=lfs -text
+*.onnx filter=lfs diff=lfs merge=lfs -text
+*.ot filter=lfs diff=lfs merge=lfs -text
+*.parquet filter=lfs diff=lfs merge=lfs -text
+*.pb filter=lfs diff=lfs merge=lfs -text
+*.pickle filter=lfs diff=lfs merge=lfs -text
+*.pkl filter=lfs diff=lfs merge=lfs -text
+*.pt filter=lfs diff=lfs merge=lfs -text
+*.pth filter=lfs diff=lfs merge=lfs -text
+*.rar filter=lfs diff=lfs merge=lfs -text
+*.safetensors filter=lfs diff=lfs merge=lfs -text
+saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+*.tar.* filter=lfs diff=lfs merge=lfs -text
+*.tar filter=lfs diff=lfs merge=lfs -text
+*.tflite filter=lfs diff=lfs merge=lfs -text
+*.tgz filter=lfs diff=lfs merge=lfs -text
+*.wasm filter=lfs diff=lfs merge=lfs -text
+*.xz filter=lfs diff=lfs merge=lfs -text
+*.zip filter=lfs diff=lfs merge=lfs -text
+*.zst filter=lfs diff=lfs merge=lfs -text
+*tfevents* filter=lfs diff=lfs merge=lfs -text
+tokenizer.json filter=lfs diff=lfs merge=lfs -text

README.md ADDED Viewed

	@@ -0,0 +1,147 @@

+---
+language:
+- en
+- zh
+library_name: transformers
+license: mit
+pipeline_tag: text-generation
+---
+# GLM-5
+<div align="center">
+<img src=https://raw.githubusercontent.com/zai-org/GLM-5/refs/heads/main/resources/logo.svg width="15%"/>
+</div>
+<p align="center">
+    👋 Join our <a href="https://raw.githubusercontent.com/zai-org/GLM-5/refs/heads/main/resources/wechat.png" target="_blank">WeChat</a> or <a href="https://discord.gg/QR7SARHRxK" target="_blank">Discord</a> community.
+    <br>
+    📖 Check out the GLM-5 <a href="https://z.ai/blog/glm-5" target="_blank">technical blog</a>.
+    <br>
+    📍 Use GLM-5 API services on <a href="https://docs.z.ai/guides/llm/glm-5">Z.ai API Platform</a>.
+    <br>
+    👉 One click to <a href="https://chat.z.ai">GLM-5</a>.
+</p>
+<p align="center">
+    [<a href="https://huggingface.co/papers/2602.15763" target="_blank">Paper</a>]
+    [<a href="https://github.com/zai-org/GLM-5" target="_blank">GitHub</a>]
+</p>
+## Introduction
+We are launching GLM-5, targeting complex systems engineering and long-horizon agentic tasks. Scaling remains one of the most important ways to improve the intelligence efficiency of Artificial General Intelligence (AGI). Compared to GLM-4.5, GLM-5 scales from 355B parameters (32B active) to 744B parameters (40B active) and increases pre-training data from 23T to 28.5T tokens. GLM-5 also integrates DeepSeek Sparse Attention (DSA), substantially reducing deployment cost while preserving long-context capacity.
+Reinforcement learning aims to bridge the gap between competence and excellence in pre-trained models. However, deploying it at scale for LLMs is challenging because RL training is inefficient. To this end, we developed [slime](https://github.com/THUDM/slime), a novel **asynchronous RL infrastructure** that substantially improves training throughput and efficiency, enabling more fine-grained post-training iterations. With advances in both pre-training and post-training, GLM-5 delivers significant improvements over GLM-4.7 across a wide range of academic benchmarks and achieves best-in-class performance among open-source models on reasoning, coding, and agentic tasks, closing the gap with frontier models.
+## Benchmark
+|                                  | GLM-5                  | GLM-4.7   | DeepSeek-V3.2 | Kimi K2.5 | Claude Opus 4.5 | Gemini 3 Pro | GPT-5.2 (xhigh) |
+| -------------------------------- | ---------------------- | --------- | ------------- |-----------| --------------- | ------------ | --------------- |
+| HLE                              | 30.5                   | 24.8      | 25.1          | 31.5      | 28.4            | 37.2         | 35.4            |
+| HLE (w/ Tools)                   | 50.4                   | 42.8      | 40.8          | 51.8      | 43.4*           | 45.8*        | 45.5*           |
+| AIME 2026 I                      | 92.7                   | 92.9      | 92.7          | 92.5      | 93.3            | 90.6         | -               |
+| HMMT Nov. 2025                   | 96.9                   | 93.5      | 90.2          | 91.1      | 91.7            | 93.0         | 97.1            |
+| IMOAnswerBench                   | 82.5                   | 82.0      | 78.3          | 81.8      | 78.5            | 83.3         | 86.3            |
+| GPQA-Diamond                     | 86.0                   | 85.7      | 82.4          | 87.6      | 87.0            | 91.9         | 92.4            |
+| SWE-bench Verified               | 77.8                   | 73.8      | 73.1          | 76.8      | 80.9            | 76.2         | 80.0            |
+| SWE-bench Multilingual           | 73.3                   | 66.7      | 70.2          | 73.0      | 77.5            | 65.0         | 72.0            |
+| Terminal-Bench 2.0 (Terminus 2)  | 56.2 / 60.7 † | 41.0      | 39.3          | 50.8      | 59.3            | 54.2         | 54.0            |
+| Terminal-Bench 2.0 (Claude Code) | 56.2 / 61.1 †  | 32.8      | 46.4          | -         | 57.9            | -            | -               |
+| CyberGym                         | 43.2                   | 23.5      | 17.3          | 41.3      | 50.6            | 39.9         | -               |
+| BrowseComp                       | 62.0                   | 52.0      | 51.4          | 60.6      | 37.0            | 37.8         | -               |
+| BrowseComp (w/ Context Manage)   | 75.9                   | 67.5      | 67.6          | 74.9      | 67.8            | 59.2         | 65.8            |
+| BrowseComp-Zh                    | 72.7                   | 66.6      | 65.0          | 62.3      | 62.4            | 66.8         | 76.1            |
+| τ²-Bench                         | 89.7                   | 87.4      | 85.3          | 80.2      | 91.6            | 90.7         | 85.5            |
+| MCP-Atlas (Public Set)           | 67.8                   | 52.0      | 62.2          | 63.8      | 65.2            | 66.6         | 68.0            |
+| Tool-Decathlon                   | 38.0                   | 23.8      | 35.2          | 27.8      | 43.5            | 36.4         | 46.3            |
+| Vending Bench 2                  | $4,432.12              | $2,376.82 | $1,034.00     | $1,198.46 | $4,967.06       | $5,478.16    | $3,591.33       |
+> *: Scores on the full set.
+>
+> †: A verified version of Terminal-Bench 2.0 that fixes some ambiguous instructions.
+See footnote for more evaluation details.
+### Footnote
+* **Humanity’s Last Exam (HLE) & other reasoning tasks**: We evaluate with a maximum generation length of 131,072 tokens (`temperature=1.0, top_p=0.95, max_new_tokens=131072`). By default, we report the text-only subset; results marked with * are from the full set. We use GPT-5.2 (medium) as the judge model. For HLE-with-tools, we use a maximum context length of 202,752 tokens.
+* **SWE-bench & SWE-bench Multilingual**: We run the SWE-bench suite with OpenHands using a tailored instruction prompt. Settings: `temperature=0.7, top_p=0.95, max_new_tokens=16384`, with a 200K context window.
+* **BrowseComp**: Without context management, we retain details from the most recent 5 turns. With context management, we use the same discard-all strategy as DeepSeek-V3.2 and Kimi K2.5.
+* **Terminal-Bench 2.0 (Terminus 2)**: We evaluate with the Terminus framework using `timeout=2h, temperature=0.7, top_p=1.0, max_new_tokens=8192`, with a 128K context window. Resource limits are capped at 16 CPUs and 32 GB RAM.
+* **Terminal-Bench 2.0 (Claude Code)**: We evaluate in Claude Code 2.1.14 (think mode, default effort) with `temperature=1.0, top_p=0.95, max_new_tokens=65536`. We remove wall-clock time limits due to generation speed, while preserving per-task CPU and memory constraints. Scores are averaged over 5 runs. We fix environment issues introduced by Claude Code and also report results on a verified Terminal-Bench 2.0 dataset that resolves ambiguous instructions (see: [https://huggingface.co/datasets/zai-org/terminal-bench-2-verified](https://huggingface.co/datasets/zai-org/terminal-bench-2-verified)).
+* **CyberGym**: We evaluate in Claude Code 2.1.18 (think mode, no web tools) with `temperature=1.0, top_p=1.0, max_new_tokens=32000` and a 250-minute timeout per task. Results are single-run Pass@1 over 1,507 tasks.
+* **MCP-Atlas**: All models are evaluated in think mode on the 500-task public subset with a 10-minute timeout per task. We use Gemini 3 Pro as the judge model.
+* **τ²-Bench**: We add a small prompt adjustment in Retail and Telecom to avoid failures caused by premature user termination. For Airline, we apply the domain fixes proposed in the Claude Opus 4.5 system card.
+* **Vending Bench 2**: Runs are conducted independently by [Andon Labs](https://andonlabs.com/evals/vending-bench-2).
+## Serve GLM-5 Locally
+### Prepare environment
+The following open-source frameworks support local deployment of GLM-5:
+- [vLLM](https://github.com/vllm-project/vllm) (v0.19.0+)
+- [SGLang](https://github.com/sgl-project/sglang) (v0.5.10+)
+- [KTransformers](https://github.com/kvcache-ai/ktransformers) (v0.5.3+)
+- [Transformers](https://github.com/huggingface/transformers) (v0.5.4+)
+- [xLLM](https://github.com/jd-opensource/xllm) (v0.8.0+)
+### Deploy
++ vLLM
+    ```shell
+    vllm serve zai-org/GLM-5 \
+         --tensor-parallel-size 8 \
+         --gpu-memory-utilization 0.85 \
+         --speculative-config.method mtp \
+         --speculative-config.num_speculative_tokens 3 \
+         --tool-call-parser glm47 \
+         --reasoning-parser glm45 \
+         --enable-auto-tool-choice \
+         --served-model-name glm-5
+    ```
+    Check the [recipes](https://github.com/vllm-project/recipes/blob/main/GLM/GLM5.md) for more details.
++ SGLang
+    ```shell
+    sglang serve \
+      --model-path zai-org/GLM-5 \
+      --tp-size 8 \
+      --tool-call-parser glm47  \
+      --reasoning-parser glm45 \
+      --speculative-algorithm EAGLE \
+      --speculative-num-steps 3 \
+      --speculative-eagle-topk 1 \
+      --speculative-num-draft-tokens 4 \
+      --mem-fraction-static 0.85 \
+      --served-model-name glm-5
+    ```
+    Check the [sglang cookbook](https://cookbook.sglang.io/autoregressive/GLM/GLM-5) for more details.
++ xLLM and other Ascend NPU
+    Please check the deployment guide [here](https://github.com/zai-org/GLM-5/blob/main/example/ascend.md).
++ KTransformers
+    Please check the deployment guide [here](https://github.com/kvcache-ai/ktransformers/blob/main/doc/en/kt-kernel/GLM-5-Tutorial.md).
+## Citation
+If you find GLM-5 useful in your research, please cite our technical report:
+```bibtex
+@misc{glm5team2026glm5vibecodingagentic,
+      title={GLM-5: from Vibe Coding to Agentic Engineering},
+      author={GLM-5-Team and Aohan Zeng and Xin Lv and Zhenyu Hou and Zhengxiao Du and Qinkai Zheng and Bin Chen and Da Yin and Chendi Ge and Chenghua Huang and Chengxing Xie and Chenzheng Zhu and Congfeng Yin and Cunxiang Wang and Gengzheng Pan and Hao Zeng and Haoke Zhang and Haoran Wang and Huilong Chen and Jiajie Zhang and Jian Jiao and Jiaqi Guo and Jingsen Wang and Jingzhao Du and Jinzhu Wu and Kedong Wang and Lei Li and Lin Fan and Lucen Zhong and Mingdao Liu and Mingming Zhao and Pengfan Du and Qian Dong and Rui Lu and Shuang-Li and Shulin Cao and Song Liu and Ting Jiang and Xiaodong Chen and Xiaohan Zhang and Xuancheng Huang and Xuezhen Dong and Yabo Xu and Yao Wei and Yifan An and Yilin Niu and Yitong Zhu and Yuanhao Wen and Yukuo Cen and Yushi Bai and Zhongpei Qiao and Zihan Wang and Zikang Wang and Zilin Zhu and Ziqiang Liu and Zixuan Li and Bojie Wang and Bosi Wen and Can Huang and Changpeng Cai and Chao Yu and Chen Li and Chengwei Hu and Chenhui Zhang and Dan Zhang and Daoyan Lin and Dayong Yang and Di Wang and Ding Ai and Erle Zhu and Fangzhou Yi and Feiyu Chen and Guohong Wen and Hailong Sun and Haisha Zhao and Haiyi Hu and Hanchen Zhang and Hanrui Liu and Hanyu Zhang and Hao Peng and Hao Tai and Haobo Zhang and He Liu and Hongwei Wang and Hongxi Yan and Hongyu Ge and Huan Liu and Huanpeng Chu and Jia'ni Zhao and Jiachen Wang and Jiajing Zhao and Jiamin Ren and Jiapeng Wang and Jiaxin Zhang and Jiayi Gui and Jiayue Zhao and Jijie Li and Jing An and Jing Li and Jingwei Yuan and Jinhua Du and Jinxin Liu and Junkai Zhi and Junwen Duan and Kaiyue Zhou and Kangjian Wei and Ke Wang and Keyun Luo and Laiqiang Zhang and Leigang Sha and Liang Xu and Lindong Wu and Lintao Ding and Lu Chen and Minghao Li and Nianyi Lin and Pan Ta and Qiang Zou and Rongjun Song and Ruiqi Yang and Shangqing Tu and Shangtong Yang and Shaoxiang Wu and Shengyan Zhang and Shijie Li and Shuang Li and Shuyi Fan and Wei Qin and Wei Tian and Weining Zhang and Wenbo Yu and Wenjie Liang and Xiang Kuang and Xiangmeng Cheng and Xiangyang Li and Xiaoquan Yan and Xiaowei Hu and Xiaoying Ling and Xing Fan and Xingye Xia and Xinyuan Zhang and Xinze Zhang and Xirui Pan and Xu Zou and Xunkai Zhang and Yadi Liu and Yandong Wu and Yanfu Li and Yidong Wang and Yifan Zhu and Yijun Tan and Yilin Zhou and Yiming Pan and Ying Zhang and Yinpei Su and Yipeng Geng and Yong Yan and Yonglin Tan and Yuean Bi and Yuhan Shen and Yuhao Yang and Yujiang Li and Yunan Liu and Yunqing Wang and Yuntao Li and Yurong Wu and Yutao Zhang and Yuxi Duan and Yuxuan Zhang and Zezhen Liu and Zhengtao Jiang and Zhenhe Yan and Zheyu Zhang and Zhixiang Wei and Zhuo Chen and Zhuoer Feng and Zijun Yao and Ziwei Chai and Ziyuan Wang and Zuzhou Zhang and Bin Xu and Minlie Huang and Hongning Wang and Juanzi Li and Yuxiao Dong and Jie Tang},
+      year={2026},
+      eprint={2602.15763},
+      archivePrefix={arXiv},
+      primaryClass={cs.LG},
+      url={https://arxiv.org/abs/2602.15763},
+}
+```
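
The scaling figures quoted in the README introduction can be turned into ratios with a few lines of arithmetic. This is a sketch that uses only the numbers stated there (parameters in billions, tokens in trillions). The dictionary layout and the `active_fraction` helper are illustrative choices, not part of the repository.

```python
# Figures quoted in the README introduction.
glm_4_5 = {"total_b": 355, "active_b": 32, "tokens_t": 23.0}
glm_5 = {"total_b": 744, "active_b": 40, "tokens_t": 28.5}


def active_fraction(model: dict) -> float:
    """Share of the total parameters that is active per token."""
    return model["active_b"] / model["total_b"]


# Growth in total parameters and in pre-training tokens from GLM-4.5 to GLM-5.
param_growth = glm_5["total_b"] / glm_4_5["total_b"]
data_growth = glm_5["tokens_t"] / glm_4_5["tokens_t"]

print(f"total parameters: {param_growth:.2f}x")
print(f"pre-training tokens: {data_growth:.2f}x")
print(f"active share: {active_fraction(glm_4_5):.1%} -> {active_fraction(glm_5):.1%}")
```

Total parameters roughly double while the active share per token falls, which is consistent with the README's emphasis on keeping deployment cost down.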

chat_template.jinja ADDED Viewed

	@@ -0,0 +1,86 @@

+[gMASK]<sop>
+{%- if tools -%}
+<|system|>
+# Tools
+You may call one or more functions to assist with the user query.
+You are provided with function signatures within <tools></tools> XML tags:
+<tools>
+{% for tool in tools %}
+{{ tool | tojson(ensure_ascii=False) }}
+{% endfor %}
+</tools>
+For each function call, output the function name and arguments within the following XML format:
+<tool_call>{function-name}<arg_key>{arg-key-1}</arg_key><arg_value>{arg-value-1}</arg_value><arg_key>{arg-key-2}</arg_key><arg_value>{arg-value-2}</arg_value>...</tool_call>{%- endif -%}
+{%- macro visible_text(content) -%}
+    {%- if content is string -%}
+        {{- content }}
+    {%- elif content is iterable and content is not mapping -%}
+        {%- for item in content -%}
+            {%- if item is mapping and item.type == 'text' -%}
+                {{- item.text }}
+            {%- elif item is string -%}
+                {{- item }}
+            {%- endif -%}
+        {%- endfor -%}
+    {%- else -%}
+        {{- content }}
+    {%- endif -%}
+{%- endmacro -%}
+{%- set ns = namespace(last_user_index=-1) %}
+{%- for m in messages %}
+    {%- if m.role == 'user' %}
+        {%- set ns.last_user_index = loop.index0 -%}
+    {%- endif %}
+{%- endfor %}
+{%- for m in messages -%}
+{%- if m.role == 'user' -%}<|user|>{{ visible_text(m.content) }}
+{%- elif m.role == 'assistant' -%}
+<|assistant|>
+{%- set reasoning_content = '' %}
+{%- set content = visible_text(m.content) %}
+{%- if m.reasoning_content is string %}
+    {%- set reasoning_content = m.reasoning_content %}
+{%- else %}
+    {%- if '</think>' in content %}
+        {%- set reasoning_content = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
+        {%- set content = content.split('</think>')[-1].lstrip('\n') %}
+    {%- endif %}
+{%- endif %}
+{%- if ((clear_thinking is defined and not clear_thinking) or loop.index0 > ns.last_user_index) and reasoning_content -%}
+{{ '<think>' + reasoning_content.strip() +  '</think>'}}
+{%- else -%}
+{{ '</think>' }}
+{%- endif -%}
+{%- if content.strip() -%}
+{{ content.strip() }}
+{%- endif -%}
+{% if m.tool_calls %}
+{% for tc in m.tool_calls %}
+{%- if tc.function %}
+    {%- set tc = tc.function %}
+{%- endif %}
+{{- '<tool_call>' + tc.name -}}
+{% set _args = tc.arguments %}{% for k, v in _args.items() %}<arg_key>{{ k }}</arg_key><arg_value>{{ v | tojson(ensure_ascii=False) if v is not string else v }}</arg_value>{% endfor %}</tool_call>{% endfor %}
+{% endif %}
+{%- elif m.role == 'tool' -%}
+{%- if m.content is string -%}
+{%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
+    {{- '<|observation|>' }}
+{%- endif %}
+{{- '<tool_response>' }}
+{{- m.content }}
+{{- '</tool_response>' }}
+{%- else -%}
+<|observation|>{% for tr in m.content %}
+<tool_response>{{ tr.output if tr.output is defined else tr }}</tool_response>{% endfor -%}
+{% endif -%}
+{%- elif m.role == 'system' -%}
+<|system|>{{ visible_text(m.content) }}
+{%- endif -%}
+{%- endfor -%}
+{%- if add_generation_prompt -%}
+    <|assistant|>{{- '</think>' if (enable_thinking is defined and not enable_thinking) else '<think>' -}}
+{%- endif -%}
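
To make the template's behavior easier to follow, here is a simplified Python sketch of what it appears to produce for plain-text conversations. It assumes that the template's whitespace control strips every newline between segments. It skips the tools block, `tool` messages, tool calls, and list-valued content entirely. The names `split_reasoning` and `render_prompt` are hypothetical; real prompts should still be produced with `tokenizer.apply_chat_template`.

```python
def split_reasoning(content: str) -> tuple[str, str]:
    """Split '<think>...</think>answer' into (reasoning, answer)."""
    if "</think>" not in content:
        return "", content
    reasoning = content.split("</think>")[0].rstrip("\n").split("<think>")[-1].lstrip("\n")
    answer = content.split("</think>")[-1].lstrip("\n")
    return reasoning, answer


def render_prompt(messages, add_generation_prompt=True,
                  enable_thinking=True, clear_thinking=True) -> str:
    """Approximate the chat template for user/system/assistant text messages."""
    # Index of the last user turn; reasoning in earlier assistant turns is dropped.
    last_user_index = -1
    for index, message in enumerate(messages):
        if message["role"] == "user":
            last_user_index = index

    parts = ["[gMASK]<sop>"]
    for index, message in enumerate(messages):
        role = message["role"]
        content = message.get("content", "")
        if role in ("user", "system"):
            parts.append(f"<|{role}|>{content}")
        elif role == "assistant":
            if isinstance(message.get("reasoning_content"), str):
                reasoning = message["reasoning_content"]
            else:
                reasoning, content = split_reasoning(content)
            keep_reasoning = (not clear_thinking) or index > last_user_index
            if keep_reasoning and reasoning:
                parts.append("<|assistant|><think>" + reasoning.strip() + "</think>")
            else:
                parts.append("<|assistant|></think>")
            parts.append(content.strip())
        else:
            # Tool messages are outside the scope of this sketch.
            raise ValueError(f"unsupported role in this sketch: {role}")
    if add_generation_prompt:
        parts.append("<|assistant|>" + ("<think>" if enable_thinking else "</think>"))
    return "".join(parts)


history = [
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "<think>plan</think>Hello"},
    {"role": "user", "content": "Bye"},
]
print(render_prompt(history))
```

With the defaults, reasoning from assistant turns before the last user message is replaced by a bare `</think>`, and the generation prompt opens a new `<think>` block unless `enable_thinking` is false.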

config.json ADDED Viewed

	@@ -0,0 +1,59 @@

+{
+  "architectures": [
+    "GlmMoeDsaForCausalLM"
+  ],
+  "attention_bias": false,
+  "attention_dropout": 0.0,
+  "dtype": "bfloat16",
+  "eos_token_id": [
+    154820,
+    154827,
+    154829
+  ],
+  "ep_size": 1,
+  "first_k_dense_replace": 3,
+  "hidden_act": "silu",
+  "head_dim": 64,
+  "hidden_size": 6144,
+  "index_head_dim": 128,
+  "index_n_heads": 32,
+  "index_topk": 2048,
+  "indexer_rope_interleave": true,
+  "initializer_range": 0.02,
+  "intermediate_size": 12288,
+  "kv_lora_rank": 512,
+  "max_position_embeddings": 202752,
+  "moe_intermediate_size": 2048,
+  "moe_layer_freq": 1,
+  "model_type": "glm_moe_dsa",
+  "n_group": 1,
+  "n_routed_experts": 256,
+  "n_shared_experts": 1,
+  "norm_topk_prob": true,
+  "num_attention_heads": 64,
+  "num_experts_per_tok": 8,
+  "num_hidden_layers": 78,
+  "num_key_value_heads": 64,
+  "num_nextn_predict_layers": 1,
+  "pad_token_id": 154820,
+  "pretraining_tp": 1,
+  "q_lora_rank": 2048,
+  "qk_head_dim": 256,
+  "qk_nope_head_dim": 192,
+  "qk_rope_head_dim": 64,
+  "rms_norm_eps": 1e-05,
+  "rope_interleave": true,
+  "rope_parameters": {
+    "rope_theta": 1000000,
+    "rope_type": "default"
+  },
+  "routed_scaling_factor": 2.5,
+  "scoring_func": "sigmoid",
+  "tie_word_embeddings": false,
+  "topk_group": 1,
+  "topk_method": "noaux_tc",
+  "transformers_version": "5.0.2.dev0",
+  "use_cache": true,
+  "v_head_dim": 256,
+  "vocab_size": 154880
+}
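
A few of the values in `config.json` can be cross-checked against one another. The sketch below copies a subset of the fields and derives some quantities from them. Its interpretations are assumptions based on DeepSeek-style configs, not statements from this repository: `first_k_dense_replace` is read as the number of leading dense layers, the shared expert is treated as always active, and `index_topk` is read as the number of positions the sparse-attention indexer selects per query.

```python
# Subset of fields copied from config.json.
config = {
    "num_hidden_layers": 78,
    "first_k_dense_replace": 3,
    "n_routed_experts": 256,
    "n_shared_experts": 1,
    "num_experts_per_tok": 8,
    "qk_head_dim": 256,
    "qk_nope_head_dim": 192,
    "qk_rope_head_dim": 64,
    "index_topk": 2048,
    "max_position_embeddings": 202752,
}

# The query/key head dimension should be the sum of its non-RoPE and RoPE parts.
rope_split_ok = (
    config["qk_nope_head_dim"] + config["qk_rope_head_dim"] == config["qk_head_dim"]
)

# Assumption: the first `first_k_dense_replace` layers are dense, the rest are MoE.
moe_layers = config["num_hidden_layers"] - config["first_k_dense_replace"]

# Fraction of routed experts selected per token, and experts touched per token
# (assumption: shared experts are always active).
routed_fraction = config["num_experts_per_tok"] / config["n_routed_experts"]
experts_per_token = config["num_experts_per_tok"] + config["n_shared_experts"]

# Assumption: at full context, each query attends to index_topk positions.
context_to_topk_ratio = config["max_position_embeddings"] // config["index_topk"]

print(rope_split_ok, moe_layers, routed_fraction, experts_per_token, context_to_topk_ratio)
```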

generation_config.json ADDED Viewed

	@@ -0,0 +1,12 @@

+{
+  "_from_model_config": true,
+  "eos_token_id": [
+    154820,
+    154827,
+    154829
+  ],
+  "pad_token_id": 154820,
+  "temperature": 1.0,
+  "top_p": 0.95,
+  "transformers_version": "5.0.2.dev0"
+}
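
The `temperature` and `top_p` defaults above control how the next-token distribution is reshaped before sampling. The following is a generic, pure-Python sketch of temperature scaling followed by nucleus (top-p) filtering. It is not how any particular inference engine implements these steps, and the function names are hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities after dividing by the temperature."""
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


def top_p_filter(probs, top_p=0.95):
    """Keep the smallest set of most likely tokens whose mass reaches top_p.

    Returns a mapping from token index to renormalized probability.
    """
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    kept, mass = [], 0.0
    for index, prob in ranked:
        kept.append((index, prob))
        mass += prob
        if mass >= top_p:
            break
    return {index: prob / mass for index, prob in kept}


probs = softmax_with_temperature([2.0, 1.0, 0.1, -1.0], temperature=1.0)
print(top_p_filter(probs, top_p=0.95))
```

With `temperature=1.0` the logits are left unchanged, so only the `top_p=0.95` cutoff trims the low-probability tail.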

model-00001-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:198ef923a7ca4effc5ead8ebf799fee10beb8ce081352fb099636f805d1deda9
+size 5342821416
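
Each `.safetensors` entry in this diff is a Git LFS pointer rather than the weights themselves: three `key value` lines giving the spec version, the object hash, and the size in bytes. A minimal sketch of reading one pointer, using the first shard's values from above, could look like this; `parse_lfs_pointer` is a hypothetical helper name.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its version, hash, and size fields."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:198ef923a7ca4effc5ead8ebf799fee10beb8ce081352fb099636f805d1deda9\n"
    "size 5342821416\n"
)
size_gib = pointer["size_bytes"] / 2**30
print(f"{pointer['hash_algorithm']} shard of {size_gib:.2f} GiB")
```

Summing `size_bytes` over all pointers would give the total download size without fetching any weights.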

model-00002-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:f41e14a00f8f65b5ddae90afc9aee29745bbe38821090b78dfc18f8b2be25738
+size 5351970840

model-00003-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:f5f696f543a4473b272bb38068bec2a5fa2ce6b00306b99e7b7264ac9a4784ea
+size 5360347320

model-00004-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:21ea9316694ec51d00be3a54f4c90c6b8c927e9f9852aa673234ce10ea46b092
+size 5360347208

model-00005-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:447d3150fda6a4a95063c52402cc32a5e185e94ccb8ab1162463a6dec34d1130
+size 5359985352

model-00006-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:5bd92a8fb83f562439f97700404160da10f43d0c18b0e2ac900128c921ddd8d7
+size 5360347320

model-00007-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ca57320e2c33917441d7ade047ea155fc630d0fa1f7933946e209486174aa12b
+size 5360347320

model-00008-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e848c982bf3fdcb5b7d8704e20e4b30aaff0d85f7314713dc82f3e069fb2d2cd
+size 5360347144

model-00009-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:67d50e69f80e4f0959e6af2ed383aad04d3e930c5788b148b692df10652e8953
+size 5359985416

model-00010-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9a18aa0bb8201311c1c31e462e3885f26299e6d025d9ff9cef40c899df3384e0
+size 5360347312

model-00011-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:86ae5f0fe4ca9bd26926ff5feb2f8eea2524ff851bac18920180a74d5829f01b
+size 5360347288

model-00012-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:1da09cf92af76c5271f500bb204b581e8fa79d086ea0b980f739c00e75a13d83
+size 5363494088

model-00013-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:4c06dc21e2bd376abdcf28dea362f00511f50a449bbcec299fe6cd1842d248ce
+size 5356838488

model-00014-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:213d05d35132ab6c76771ccc87d9798b3417505f1456900fc111d1aa2dafafb5
+size 5360347312

model-00015-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:b04e317d274bed49d942c5d6f99565ab80149a588528ce9208d9cc3c9000177f
+size 5360347224

model-00016-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:25f73b1e9e9fde8f9588253b0c2a9884ff115ccacf4114ad37be355ce2361454
+size 5359985336

model-00017-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:18d47fcb369b742e775ddc25ca499858533255ec50e0e76f2405b1f3d92b7b70
+size 5360347320

model-00018-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:62e23e54f40709fc762c172444a5dee41e78111df151fdbe7b48d75386a36deb
+size 5360347312

model-00019-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:7599267bb1fc68aeb5963661fb9f85b3e997b5752b278b27389e0560052c919a
+size 5360347160

model-00020-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:681e153de4a650fa6667b55f48b1ed424741ef1633b772352f5c069b3a6e0fbd
+size 5359985400

model-00021-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:78b0212db6f9f6eaa9fec4a317c1fe418215903c5b3dd6cc8de9700dfcbc5279
+size 5360347320

model-00022-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:947ef24293fcc3c8373eb1be27c0c3a15321e182394820249150617954e43043
+size 5360347296

model-00023-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:1a9b4276d6c19121b5bc7da52624b6f6cc84e69160dd8cea449c2d0c4c35d5ac
+size 5360347112

model-00024-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:3c800de893450a11db9d3784fc7bd97d946ed003379c32b2e7e75e2052f538ac
+size 5359985464

model-00025-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:21446e1f7a06a579eacb09ea0f86f88c4d200bab1cfc46eb59276ab5c69ba147
+size 5360347320

model-00026-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:6c35662afd872d3d83f028523a774f3bf3873b8fd2772553097bd5712a28a5c4
+size 5360347232

model-00027-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:4a7a3615c744098411b2bafc0061a521c3348e4002f553810c0932b39485f5a2
+size 5359985328

model-00028-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:778444540ff8a43a51574b81a75eba0a15aa3df1bb4308659995b8f05d4a29a4
+size 5360347320

model-00029-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:d908c8177c1dfe877627baf157fa86bd3c5e6d7c2eea5cd8c84f0eacb12559a4
+size 5360347320

model-00030-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:6df7b340183ad645f171edec945c107f4c417b2421933a99071f3f355dfe6e7c
+size 5360347168

model-00031-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:a658a7bec4268a3d059bae15dab8d106ec071c8f72d66410430d81eff06696ae
+size 5359985392

model-00032-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:f291e238a4ca7686a833566aeaacc1706b66daa81dc9aab5de35e0670651b2af
+size 5360347320

model-00033-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:d58b7681a194c6cf533a852c5780799df82cdb9776c5b24cb30778149645d7cb
+size 5360347312

model-00034-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2f9666db39fcfad212eddb74581d83d9da9a16366080cdb2fcb1d0a6797db56d
+size 5360347104

model-00035-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:387257cc5daff11c898836a457b8ba3b889d31e6756e7dde5f4bb6610909573a
+size 5359985456

model-00036-of-00282.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:db004f95e3d6970343042fa1fef65528db625b1d954fa14f7c61b797c2b9d9bb
+size 5360347320