Upload folder using huggingface_hub

- README.md +13 -211
- model-00001-of-00007.safetensors +3 -0
- model-00002-of-00007.safetensors +3 -0
- model-00003-of-00007.safetensors +3 -0
- model-00004-of-00007.safetensors +3 -0
- model-00005-of-00007.safetensors +3 -0
- model-00006-of-00007.safetensors +3 -0
- model-00007-of-00007.safetensors +3 -0
- model.safetensors.index.json +0 -0
README.md
CHANGED
@@ -1,228 +1,30 @@
 ---
-license: mit
 language:
 - en
 - zh
 library_name: transformers
 pipeline_tag: text-generation
 tags:
-- code
-- moe
-- glm
-- coding
-- programming
-- software-engineering
 base_model: zai-org/GLM-4.7-Flash
-model-index:
-- name: zen-coder-flash
-  results:
-  - task:
-      type: text-generation
-      name: Code Generation
-    dataset:
-      name: SWE-bench Verified
-      type: swe-bench
-    metrics:
-    - type: accuracy
-      value: 59.2
-      name: SWE-bench Verified
-  - task:
-      type: text-generation
-      name: Mathematical Reasoning
-    dataset:
-      name: AIME 2025
-      type: aime
-    metrics:
-    - type: accuracy
-      value: 91.6
-      name: AIME 2025
 ---
-
-<div align="center">
-<img src="https://zenlm.org/logo.png" alt="Zen AI" width="200"/>
-
-**The Flagship Zen Coder Model**
-
-[](https://opensource.org/licenses/MIT)
-[](https://huggingface.co/zenlm/zen-coder-flash)
-</div>
-
-## Overview
-
-**Zen Coder Flash** is the flagship code-focused model in the Zen AI family. Built on GLM-4.7-Flash's cutting-edge Mixture of Experts architecture, it delivers frontier coding performance with practical efficiency.
-
-| Attribute | Value |
-|-----------|-------|
-| **Parameters** | 31B total / 3B active (MoE) |
-| **Context Length** | 131,072 tokens |
-| **Base Model** | [GLM-4.7-Flash](https://huggingface.co/zai-org/GLM-4.7-Flash) |
-| **License** | MIT |
-| **Languages** | 100+ programming languages |
-
-## Why Zen Coder Flash?
-
-- **59.2% SWE-bench Verified** vs 22% for Qwen3-30B, nearly **3x better** at real coding tasks
-- **Efficient MoE**: 31B parameters, but only 3B active per token
-- **131K context**: handle entire codebases in a single prompt
-- **Native tool calling**: built-in function execution support
-- **Reasoning mode**: extended chain-of-thought for complex problems
-
-## Performance
-
-| Benchmark | Score | vs Qwen3-30B |
-|-----------|-------|--------------|
-| SWE-bench Verified | **59.2%** | +37.2 pts (2.7x) |
-| AIME 2025 | **91.6%** | +6.6 pts |
-| GPQA | **75.2%** | +1.8 pts |
-| τ²-Bench | **79.5%** | +30.5 pts |
-
-## Zen Coder Family
-
-| Tier | Model | Parameters | Active | Use Case |
-|------|-------|------------|--------|----------|
-| Small | [zen-coder-4b](https://huggingface.co/zenlm/zen-coder) | 4B | 4B | Edge/mobile |
-| **Flagship** | **zen-coder-flash** | **31B MoE** | **3B** | **Balanced** |
-| Max | [zen-max](https://huggingface.co/zenlm/zen-max) | 671B MoE | 14B | Frontier |
-
-## Quick Start
-
-### Transformers
-
-```python
-import torch
-from transformers import AutoModelForCausalLM, AutoTokenizer
-
-model_id = "zenlm/zen-coder-flash"
-
-tokenizer = AutoTokenizer.from_pretrained(model_id)
-model = AutoModelForCausalLM.from_pretrained(
-    model_id,
-    torch_dtype=torch.bfloat16,
-    device_map="auto",
-)
-
-messages = [{"role": "user", "content": "Write a Python function to find all prime numbers up to n using the Sieve of Eratosthenes"}]
-
-inputs = tokenizer.apply_chat_template(
-    messages,
-    tokenize=True,
-    add_generation_prompt=True,
-    return_dict=True,
-    return_tensors="pt",
-).to(model.device)
-
-outputs = model.generate(**inputs, max_new_tokens=512)
-print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
-```
-
-### vLLM
-
-```bash
-vllm serve zenlm/zen-coder-flash \
-    --tensor-parallel-size 4 \
-    --speculative-config.method mtp \
-    --speculative-config.num_speculative_tokens 1 \
-    --tool-call-parser glm47 \
-    --reasoning-parser glm45 \
-    --enable-auto-tool-choice
-```
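The `vllm serve` command above exposes an OpenAI-compatible HTTP API (by default at `http://localhost:8000/v1`). As a sketch of what a client request would carry, here is only the payload construction, so it runs without a live server; the prompt text is illustrative, and the endpoint/port mentioned are vLLM defaults:

```python
import json

# Chat-completions payload for vLLM's OpenAI-compatible endpoint
# (POST http://localhost:8000/v1/chat/completions by default).
payload = {
    "model": "zenlm/zen-coder-flash",  # must match the served model name
    "messages": [
        {"role": "user", "content": "Write a Python function for binary search."}
    ],
    "max_tokens": 512,
}

body = json.dumps(payload)
print(json.loads(body)["model"])  # → zenlm/zen-coder-flash
```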
-
-### SGLang
-
-```bash
-python -m sglang.launch_server \
-    --model-path zenlm/zen-coder-flash \
-    --tp-size 4 \
-    --tool-call-parser glm47 \
-    --reasoning-parser glm45 \
-    --speculative-algorithm EAGLE \
-    --speculative-num-steps 3
-```
-
-### MLX
-
-```python
-from mlx_lm import load, generate
-
-model, tokenizer = load("zenlm/zen-coder-flash")
-response = generate(model, tokenizer, prompt="Write a Rust function for binary search", max_tokens=256)
-print(response)
-```
-
-## Capabilities
-
-### Code Generation
-- 100+ programming languages
-- Framework-aware completions
-- Test generation
-- Documentation generation
-
-### Debugging & Analysis
-- Bug detection and fixes
-- Code review
-- Performance optimization
-- Security analysis
-
-### Software Engineering
-- Architecture design
-- API design
-- Refactoring suggestions
-- Migration assistance
-
-### Tool Calling
-```python
-# Native function calling support
-tools = [
-    {
-        "type": "function",
-        "function": {
-            "name": "run_tests",
-            "description": "Run test suite",
-            "parameters": {"type": "object", "properties": {}}
-        }
-    }
-]
-```
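The tool definition above uses the OpenAI-style function schema; recent `transformers` releases accept such a list through the `tools` argument of `apply_chat_template`. A minimal, model-free sketch that only checks the definition serializes cleanly (`run_tests` is the card's illustrative tool name):

```python
import json

# OpenAI-style function schema, as in the card's tool-calling example.
tools = [
    {
        "type": "function",
        "function": {
            "name": "run_tests",
            "description": "Run test suite",
            "parameters": {"type": "object", "properties": {}},
        },
    }
]

# Round-trip through JSON to confirm the definition is well-formed.
parsed = json.loads(json.dumps(tools))
print(parsed[0]["function"]["name"])  # → run_tests
```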
-
-## Identity
-
-I am **Zen Coder Flash**, the flagship code-focused model in the Zen AI family. I combine GLM-4.7's cutting-edge MoE architecture with Zen's philosophy of clarity and efficiency. With 31 billion parameters (only 3B active per token) and 131K context, I deliver frontier coding capability that's practical to deploy.
-
-## Training
-
-Zen Coder Flash is built through identity fine-tuning on GLM-4.7-Flash using MLX LoRA on Apple Silicon. The training emphasizes:
-
-- Zen identity and persona
-- Code-focused instruction following
-- Tool calling capabilities
-- Extended reasoning patterns
-
-## Citation
-
-```bibtex
-@misc{zen-coder-flash-2025,
-  title={Zen Coder Flash: Efficient Frontier Code Generation},
-  author={Hanzo AI},
-  year={2025},
-  url={https://huggingface.co/zenlm/zen-coder-flash}
-}
-```
-
-## Links
-
-- **Website**: [zenlm.org](https://zenlm.org)
-- **GitHub**: [zenlm/zen](https://github.com/zenlm/zen)
-- **Base Model**: [GLM-4.7-Flash](https://huggingface.co/zai-org/GLM-4.7-Flash)
-- **Organization**: [Hanzo AI](https://hanzo.ai)
-
-## License
-
-MIT License - inherited from GLM-4.7-Flash base model.
-
----
-
 ---
 language:
 - en
 - zh
 library_name: transformers
+license: mit
 pipeline_tag: text-generation
 tags:
+- mlx
 base_model: zai-org/GLM-4.7-Flash
 ---

+## 💫 Community Model> GLM-4.7-Flash by zai-org

+_👾 [LM Studio](https://lmstudio.ai) Community models highlights program. Highlighting new & noteworthy models by the community. Join the conversation on [Discord](https://discord.gg/aPQfnNkxGC)_.
+**Model creator**: [zai-org](https://huggingface.co/zai-org)<br>
+**Original model**: [GLM-4.7-Flash](https://huggingface.co/zai-org/GLM-4.7-Flash)<br>
+**MLX quantization**: provided by [LM Studio team](https://x.com/lmstudio) using [mlx_lm](https://github.com/ml-explore/mlx-lm)<br>

+## Technical Details

+8-bit quantized version of GLM-4.7-Flash using MLX, optimized for Apple Silicon.
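As a rough consistency check on the 8-bit claim: the seven safetensors shards added in this commit should total about one byte per parameter for a ~31B-parameter model, with a small overhead for quantization scales and other metadata. Summing the shard sizes listed in this diff:

```python
# Shard sizes in bytes, from the seven model-*.safetensors files in this commit.
shard_sizes = [
    5_176_178_595, 5_368_050_997, 5_187_037_498, 5_187_300_215,
    5_187_300_077, 5_368_051_110, 347_059_898,
]

total_bytes = sum(shard_sizes)
print(f"{total_bytes / 1e9:.1f} GB")  # → 31.8 GB, ~1 byte/param for 31B params
```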

+## Special thanks

+🙏 Special thanks to the [Apple Machine Learning Research](https://github.com/ml-explore) team for creating [MLX](https://github.com/ml-explore/mlx).

+## Disclaimers

+LM Studio is not the creator, originator, or owner of any Model featured in the Community Model Program. Each Community Model is created and provided by third parties. LM Studio does not endorse, support, represent or guarantee the completeness, truthfulness, accuracy, or reliability of any Community Model. You understand that Community Models can produce content that might be offensive, harmful, inaccurate or otherwise inappropriate, or deceptive. Each Community Model is the sole responsibility of the person or entity who originated such Model. LM Studio may not monitor or control the Community Models and cannot, and does not, take responsibility for any such Model. LM Studio disclaims all warranties or guarantees about the accuracy, reliability or benefits of the Community Models. LM Studio further disclaims any warranty that the Community Model will meet your requirements, be secure, uninterrupted or available at any time or location, or error-free, viruses-free, or that any errors will be corrected, or otherwise. You will be solely responsible for any damage resulting from your use of or access to the Community Models, your downloading of any Community Model, or use of any other Community Model provided by or through LM Studio.
model-00001-of-00007.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8a550dabf6e2789a9d704211d75c681a27dd9d75e037c468e6d3fe25e797dfc8
+size 5176178595
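The three-line file above is a Git LFS pointer (spec v1): `key value` lines giving the spec URL, the SHA-256 of the actual blob, and its size in bytes; the six shards that follow have the same shape. A small sketch of parsing such a pointer, using the first shard's contents from this diff:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS v1 pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:8a550dabf6e2789a9d704211d75c681a27dd9d75e037c468e6d3fe25e797dfc8
size 5176178595
"""

info = parse_lfs_pointer(pointer)
print(info["size"])  # → 5176178595
```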
model-00002-of-00007.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:05a51988a3602965ea8f21e0766240c8d890321fc3e219adcaa3d8b6108bb327
+size 5368050997
model-00003-of-00007.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:0d0ed3c08f419f5c7ad90e96933948eaf4cd5d3b410dd9b2a3ffeb652ce026e0
+size 5187037498
model-00004-of-00007.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:cf1c40d389c2d327844f6c7b0597d9ad519b059887c98f503ec87b3d00014375
+size 5187300215
model-00005-of-00007.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:94c98e48509c361cfc93ce786d6d2b55270595e9452bd521506e9e664ff79ff6
+size 5187300077
model-00006-of-00007.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b4335a3cad7c45bdc5377c7c3a1a6f31f2a1ab9a9b2022419749d8d738d36343
+size 5368051110
model-00007-of-00007.safetensors
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bc77ebb277c9f56a498eee9daefbe245464569f1517408961dfbde02ab653b3a
+size 347059898
model.safetensors.index.json
ADDED
The diff for this file is too large to render.