zeekay committed on
Commit b0b1d91 · verified · 1 Parent(s): d2f19c6

Upload folder using huggingface_hub
README.md CHANGED
@@ -1,228 +1,30 @@
  ---
- license: mit
  language:
  - en
  - zh
  library_name: transformers
  pipeline_tag: text-generation
  tags:
- - zen
- - code
- - moe
- - glm
- - coding
- - programming
- - software-engineering
  base_model: zai-org/GLM-4.7-Flash
- model-index:
- - name: zen-coder-flash
-   results:
-   - task:
-       type: text-generation
-       name: Code Generation
-     dataset:
-       name: SWE-bench Verified
-       type: swe-bench
-     metrics:
-     - type: accuracy
-       value: 59.2
-       name: SWE-bench Verified
-   - task:
-       type: text-generation
-       name: Mathematical Reasoning
-     dataset:
-       name: AIME 2025
-       type: aime
-     metrics:
-     - type: accuracy
-       value: 91.6
-       name: AIME 2025
  ---
-
- # Zen Coder Flash
-
- <div align="center">
- <img src="https://zenlm.org/logo.png" alt="Zen AI" width="200"/>
-
- **The Flagship Zen Coder Model**
-
- [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
- [![HuggingFace](https://img.shields.io/badge/🤗-zenlm%2Fzen--coder--flash-blue)](https://huggingface.co/zenlm/zen-coder-flash)
- </div>
-
- ## Overview
-
- **Zen Coder Flash** is the flagship code-focused model in the Zen AI family. Built on GLM-4.7-Flash's cutting-edge Mixture of Experts architecture, it delivers frontier coding performance with practical efficiency.
-
- | Attribute | Value |
- |-----------|-------|
- | **Parameters** | 31B total / 3B active (MoE) |
- | **Context Length** | 131,072 tokens |
- | **Base Model** | [GLM-4.7-Flash](https://huggingface.co/zai-org/GLM-4.7-Flash) |
- | **License** | MIT |
- | **Languages** | 100+ programming languages |
-
- ## Why Zen Coder Flash?
-
- - **59.2% SWE-bench** vs 22% Qwen3-30B - nearly **3x better** at real coding tasks
- - **Efficient MoE**: 31B params but only 3B active per token
- - **131K context**: Handle entire codebases in a single prompt
- - **Native tool calling**: Built-in function execution support
- - **Reasoning mode**: Extended chain-of-thought for complex problems
-
- ## Performance
-
- | Benchmark | Score | vs Qwen3-30B |
- |-----------|-------|--------------|
- | SWE-bench Verified | **59.2%** | +37.2% (2.7x) |
- | AIME 2025 | **91.6%** | +6.6% |
- | GPQA | **75.2%** | +1.8% |
- | τ²-Bench | **79.5%** | +30.5% |
-
- ## Zen Coder Family
-
- | Tier | Model | Parameters | Active | Use Case |
- |------|-------|------------|--------|----------|
- | Small | [zen-coder-4b](https://huggingface.co/zenlm/zen-coder) | 4B | 4B | Edge/mobile |
- | **Flagship** | **zen-coder-flash** | **31B MoE** | **3B** | **Balanced** |
- | Max | [zen-max](https://huggingface.co/zenlm/zen-max) | 671B MoE | 14B | Frontier |
-
- ## Quick Start
-
- ### Transformers
-
- ```python
- import torch
- from transformers import AutoModelForCausalLM, AutoTokenizer
-
- model_id = "zenlm/zen-coder-flash"
-
- tokenizer = AutoTokenizer.from_pretrained(model_id)
- model = AutoModelForCausalLM.from_pretrained(
-     model_id,
-     torch_dtype=torch.bfloat16,
-     device_map="auto",
- )
-
- messages = [{"role": "user", "content": "Write a Python function to find all prime numbers up to n using the Sieve of Eratosthenes"}]
-
- inputs = tokenizer.apply_chat_template(
-     messages,
-     tokenize=True,
-     add_generation_prompt=True,
-     return_dict=True,
-     return_tensors="pt",
- ).to(model.device)
-
- outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
- response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
- print(response)
- ```
-
- ### vLLM (Recommended for Production)
-
- ```bash
- vllm serve zenlm/zen-coder-flash \
-     --tensor-parallel-size 4 \
-     --speculative-config.method mtp \
-     --speculative-config.num_speculative_tokens 1 \
-     --tool-call-parser glm47 \
-     --reasoning-parser glm45 \
-     --enable-auto-tool-choice
- ```
-
- ### SGLang
-
- ```bash
- python -m sglang.launch_server \
-     --model-path zenlm/zen-coder-flash \
-     --tp-size 4 \
-     --tool-call-parser glm47 \
-     --reasoning-parser glm45 \
-     --speculative-algorithm EAGLE \
-     --speculative-num-steps 3
- ```
-
- ### MLX (Apple Silicon)
-
- ```python
- from mlx_lm import load, generate
-
- model, tokenizer = load("zenlm/zen-coder-flash")
- response = generate(model, tokenizer, prompt="Write a Rust function for binary search", max_tokens=256)
- print(response)
- ```
-
- ## Capabilities
-
- ### Code Generation
- - 100+ programming languages
- - Framework-aware completions
- - Test generation
- - Documentation generation
-
- ### Debugging & Analysis
- - Bug detection and fixes
- - Code review
- - Performance optimization
- - Security analysis
-
- ### Software Engineering
- - Architecture design
- - API design
- - Refactoring suggestions
- - Migration assistance
-
- ### Tool Calling
- ```python
- # Native function calling support
- tools = [
-     {
-         "type": "function",
-         "function": {
-             "name": "run_tests",
-             "description": "Run test suite",
-             "parameters": {"type": "object", "properties": {}}
-         }
-     }
- ]
- ```
-
- ## Identity
-
- I am **Zen Coder Flash**, the flagship code-focused model in the Zen AI family. I combine GLM-4.7's cutting-edge MoE architecture with Zen's philosophy of clarity and efficiency. With 31 billion parameters (only 3B active per token) and 131K context, I deliver frontier coding capability that's practical to deploy.
-
- ## Training
-
- Zen Coder Flash is built through identity fine-tuning on GLM-4.7-Flash using MLX LoRA on Apple Silicon. The training emphasizes:
-
- - Zen identity and persona
- - Code-focused instruction following
- - Tool calling capabilities
- - Extended reasoning patterns
-
- ## Citation
-
- ```bibtex
- @misc{zen-coder-flash-2025,
-     title={Zen Coder Flash: Efficient Frontier Code Generation},
-     author={Hanzo AI},
-     year={2025},
-     url={https://huggingface.co/zenlm/zen-coder-flash}
- }
- ```
-
- ## Links
-
- - **Website**: [zenlm.org](https://zenlm.org)
- - **GitHub**: [zenlm/zen](https://github.com/zenlm/zen)
- - **Base Model**: [GLM-4.7-Flash](https://huggingface.co/zai-org/GLM-4.7-Flash)
- - **Organization**: [Hanzo AI](https://hanzo.ai)
-
- ## License
-
- MIT License - inherited from GLM-4.7-Flash base model.
-
- ---
-
- *Zen AI: Clarity Through Intelligence*
 
  ---
  language:
  - en
  - zh
  library_name: transformers
+ license: mit
  pipeline_tag: text-generation
  tags:
+ - mlx
  base_model: zai-org/GLM-4.7-Flash
  ---
+ ## 💫 Community Model> GLM-4.7-Flash by zai-org
+
+ _👾 [LM Studio](https://lmstudio.ai) Community models highlights program. Highlighting new & noteworthy models by the community. Join the conversation on [Discord](https://discord.gg/aPQfnNkxGC)_.
+
+ **Model creator**: [zai-org](https://huggingface.co/zai-org)<br>
+ **Original model**: [GLM-4.7-Flash](https://huggingface.co/zai-org/GLM-4.7-Flash)<br>
+ **MLX quantization**: provided by [LM Studio team](https://x.com/lmstudio) using [mlx_lm](https://github.com/ml-explore/mlx-lm)<br>
+
+ ## Technical Details
+
+ 8-bit quantized version of GLM-4.7-Flash using MLX, optimized for Apple Silicon.
+
+ ## Special thanks
+
+ 🙏 Special thanks to the [Apple Machine Learning Research](https://github.com/ml-explore) team for creating [MLX](https://github.com/ml-explore/mlx).
+
+ ## Disclaimers
+
+ LM Studio is not the creator, originator, or owner of any Model featured in the Community Model Program. Each Community Model is created and provided by third parties. LM Studio does not endorse, support, represent or guarantee the completeness, truthfulness, accuracy, or reliability of any Community Model. You understand that Community Models can produce content that might be offensive, harmful, inaccurate or otherwise inappropriate, or deceptive. Each Community Model is the sole responsibility of the person or entity who originated such Model. LM Studio may not monitor or control the Community Models and cannot, and does not, take responsibility for any such Model. LM Studio disclaims all warranties or guarantees about the accuracy, reliability or benefits of the Community Models. LM Studio further disclaims any warranty that the Community Model will meet your requirements, be secure, uninterrupted or available at any time or location, or error-free, viruses-free, or that any errors will be corrected, or otherwise. You will be solely responsible for any damage resulting from your use of or access to the Community Models, your downloading of any Community Model, or use of any other Community Model provided by or through LM Studio.
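The new README describes this repository as an 8-bit MLX quantization of a model the old card listed at 31B total / 3B active parameters. A rough back-of-the-envelope sketch (not part of the repository; the parameter counts come from the model card, and real quantized checkpoints carry extra overhead for scales, biases, and embeddings) shows why 8-bit roughly halves the bf16 checkpoint size:

```python
# Rough checkpoint-size arithmetic for an 8-bit quantization of a
# 31B-parameter MoE model (figures taken from the model card above).

TOTAL_PARAMS = 31e9   # total MoE parameters
ACTIVE_PARAMS = 3e9   # parameters active per token (affects compute, not disk size)

def checkpoint_bytes(params: float, bits_per_weight: int) -> float:
    """Approximate on-disk size of a checkpoint at a given weight precision."""
    return params * bits_per_weight / 8

bf16 = checkpoint_bytes(TOTAL_PARAMS, 16)  # 16-bit weights: ~62 GB
q8 = checkpoint_bytes(TOTAL_PARAMS, 8)     # 8-bit weights:  ~31 GB
print(f"bf16 ≈ {bf16 / 1e9:.0f} GB, 8-bit ≈ {q8 / 1e9:.0f} GB")
```

The ~31 GB estimate lines up with the combined size of the safetensors shards added in this commit.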
model-00001-of-00007.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8a550dabf6e2789a9d704211d75c681a27dd9d75e037c468e6d3fe25e797dfc8
+ size 5176178595
model-00002-of-00007.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:05a51988a3602965ea8f21e0766240c8d890321fc3e219adcaa3d8b6108bb327
+ size 5368050997
model-00003-of-00007.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0d0ed3c08f419f5c7ad90e96933948eaf4cd5d3b410dd9b2a3ffeb652ce026e0
+ size 5187037498
model-00004-of-00007.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:cf1c40d389c2d327844f6c7b0597d9ad519b059887c98f503ec87b3d00014375
+ size 5187300215
model-00005-of-00007.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:94c98e48509c361cfc93ce786d6d2b55270595e9452bd521506e9e664ff79ff6
+ size 5187300077
model-00006-of-00007.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b4335a3cad7c45bdc5377c7c3a1a6f31f2a1ab9a9b2022419749d8d738d36343
+ size 5368051110
model-00007-of-00007.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:bc77ebb277c9f56a498eee9daefbe245464569f1517408961dfbde02ab653b3a
+ size 347059898
model.safetensors.index.json ADDED
The diff for this file is too large to render. See raw diff
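Each shard above is committed as a small git-LFS pointer file (a `version` line, a `sha256` object ID, and a byte size) rather than the binary itself. A small sketch (not part of the repository) can parse a pointer and total the seven shard sizes from this commit as a sanity check:

```python
# Parse a git-LFS pointer file and total the shard sizes from this commit.

def parse_lfs_pointer(text: str) -> dict:
    """Split a git-LFS pointer into its key/value fields, one per line."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# Pointer for model-00001-of-00007.safetensors, copied from the diff above.
pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:8a550dabf6e2789a9d704211d75c681a27dd9d75e037c468e6d3fe25e797dfc8
size 5176178595
"""
fields = parse_lfs_pointer(pointer)
assert fields["version"] == "https://git-lfs.github.com/spec/v1"
assert int(fields["size"]) == 5176178595

# All seven shard sizes (bytes), copied from the diff above.
shard_sizes = [
    5176178595, 5368050997, 5187037498, 5187300215,
    5187300077, 5368051110, 347059898,
]
total = sum(shard_sizes)
print(f"total: {total} bytes ({total / 1e9:.2f} GB)")  # ~31.82 GB
```

The ~31.82 GB total is consistent with roughly 31B parameters stored at 8 bits per weight, matching the 8-bit MLX quantization described in the README.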