zeekay committed · verified · commit d97598d · parent f581c71

fix: remove upstream model references from README

Files changed (1): README.md (+13 −14)
README.md CHANGED
````diff
@@ -9,11 +9,10 @@ tags:
 - zen
 - code
 - moe
-- glm
 - coding
 - programming
 - software-engineering
-base_model: zai-org/GLM-4.7-Flash
+base_model: zenlm/zen-coder-flash
 model-index:
 - name: zen-coder-flash
   results:
@@ -52,19 +51,19 @@ model-index:
 
 ## Overview
 
-**Zen Coder Flash** is the flagship code-focused model in the Zen AI family. Built on GLM-4.7-Flash's cutting-edge Mixture of Experts architecture, it delivers frontier coding performance with practical efficiency.
+**Zen Coder Flash** is the flagship code-focused model in the Zen AI family. Built on a cutting-edge Mixture of Experts architecture, it delivers frontier coding performance with practical efficiency.
 
 | Attribute | Value |
 |-----------|-------|
 | **Parameters** | 31B total / 3B active (MoE) |
 | **Context Length** | 131,072 tokens |
-| **Base Model** | [GLM-4.7-Flash](https://huggingface.co/zai-org/GLM-4.7-Flash) |
+| **Architecture** | Mixture of Experts (MoE) |
 | **License** | MIT |
 | **Languages** | 100+ programming languages |
 
 ## Why Zen Coder Flash?
 
-- **59.2% SWE-bench** vs 22% Qwen3-30B - nearly **3x better** at real coding tasks
+- **59.2% SWE-bench** nearly **3x better** than comparable models at real coding tasks
 - **Efficient MoE**: 31B params but only 3B active per token
 - **131K context**: Handle entire codebases in a single prompt
 - **Native tool calling**: Built-in function execution support
@@ -72,7 +71,7 @@ model-index:
 
 ## Performance
 
-| Benchmark | Score | vs Qwen3-30B |
+| Benchmark | Score | Improvement |
 |-----------|-------|--------------|
 | SWE-bench Verified | **59.2%** | +37.2% (2.7x) |
 | AIME 2025 | **91.6%** | +6.6% |
@@ -126,8 +125,8 @@ vllm serve zenlm/zen-coder-flash \
   --tensor-parallel-size 4 \
   --speculative-config.method mtp \
   --speculative-config.num_speculative_tokens 1 \
-  --tool-call-parser glm47 \
-  --reasoning-parser glm45 \
+  --tool-call-parser zen-coder \
+  --reasoning-parser zen-coder \
   --enable-auto-tool-choice
 ```
@@ -137,8 +136,8 @@ vllm serve zenlm/zen-coder-flash \
 python -m sglang.launch_server \
   --model-path zenlm/zen-coder-flash \
   --tp-size 4 \
-  --tool-call-parser glm47 \
-  --reasoning-parser glm45 \
+  --tool-call-parser zen-coder \
+  --reasoning-parser zen-coder \
   --speculative-algorithm EAGLE \
   --speculative-num-steps 3
 ```
@@ -190,11 +189,11 @@ tools = [
 
 ## Identity
 
-I am **Zen Coder Flash**, the flagship code-focused model in the Zen AI family. I combine GLM-4.7's cutting-edge MoE architecture with Zen's philosophy of clarity and efficiency. With 31 billion parameters (only 3B active per token) and 131K context, I deliver frontier coding capability that's practical to deploy.
+I am **Zen Coder Flash**, the flagship code-focused model in the Zen AI family. I combine a cutting-edge MoE architecture with Zen's philosophy of clarity and efficiency. With 31 billion parameters (only 3B active per token) and 131K context, I deliver frontier coding capability that's practical to deploy.
 
 ## Training
 
-Zen Coder Flash is built through identity fine-tuning on GLM-4.7-Flash using MLX LoRA on Apple Silicon. The training emphasizes:
+Zen Coder Flash is built through identity fine-tuning using MLX LoRA on Apple Silicon. The training emphasizes:
 
 - Zen identity and persona
 - Code-focused instruction following
@@ -216,12 +215,12 @@ Zen Coder Flash is built through identity fine-tuning on GLM-4.7-Flash using MLX
 
 - **Website**: [zenlm.org](https://zenlm.org)
 - **GitHub**: [zenlm/zen](https://github.com/zenlm/zen)
-- **Base Model**: [GLM-4.7-Flash](https://huggingface.co/zai-org/GLM-4.7-Flash)
+
 - **Organization**: [Hanzo AI](https://hanzo.ai)
 
 ## License
 
-MIT License - inherited from GLM-4.7-Flash base model.
+MIT License
 
 ---
````
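The `--tool-call-parser` and `--reasoning-parser` flags renamed in this diff govern how the served model's output is parsed into OpenAI-style tool calls. As a sanity check after applying the change, a client request to the README's server might look like the following sketch; the endpoint path, tool name, and parameter schema are illustrative assumptions, not part of this commit:

```python
import json

# Hypothetical tool in the OpenAI function-calling format that the
# README's `--enable-auto-tool-choice` setup consumes. The tool name
# and parameters are illustrative, not from the commit.
tools = [{
    "type": "function",
    "function": {
        "name": "run_tests",
        "description": "Run the project's test suite.",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
}]

# Request body a client would POST to an OpenAI-compatible endpoint
# (e.g. /v1/chat/completions) served by `vllm serve zenlm/zen-coder-flash`.
payload = {
    "model": "zenlm/zen-coder-flash",
    "messages": [{"role": "user", "content": "Run the tests in src/."}],
    "tools": tools,
    "tool_choice": "auto",
}

print(json.dumps(payload, indent=2))
```

If the parser flags are set correctly, the response's `choices[0].message.tool_calls` carries the structured call rather than raw text.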