zeekay committed · verified
Commit 2f2e398 · 1 Parent(s): 0c3af94

Upload README.md with huggingface_hub

Files changed (1):
  1. README.md +26 -34
README.md CHANGED
@@ -2,35 +2,31 @@
 library_name: transformers
 license: apache-2.0
 tags:
-- deepseek
-- kimi_k2
+- zen
 - text-generation
 - reasoning
 - agentic
 - tool-calling
+- moe
 - compressed-tensors
 pipeline_tag: text-generation
-base_model: moonshotai/Kimi-K2-Thinking
 ---
 
-# Zen Max - Kimi K2 Thinking Architecture
+# Zen Max
 
-**Organization**: [Zen LM](https://zenlm.org) (Hanzo AI × Zoo Labs Foundation)
-**Base Model**: Moonshot AI Kimi K2 Thinking (DeepseekV3ForCausalLM)
+**Organization**: [Zen LM](https://zenlm.org) (Hanzo AI × Zoo Labs Foundation)
 **Parameters**: 1.04T total (1,044B — MoE with 32B active per token)
-**License**: Apache 2.0
-**Context Window**: 256K tokens
-**Thinking Capacity**: 96K-128K thinking tokens per step
-**Architecture**: DeepseekV3 MoE (Mixture of Experts)
+**License**: Apache 2.0
+**Context Window**: 256K tokens
+**Thinking Capacity**: 96K-128K thinking tokens per step
+**Architecture**: MoE (Mixture of Experts)
 
 ## Model Overview
 
-Zen Max is a 1T+ reasoning-first language model built on Moonshot AI's Kimi K2 Thinking architecture, designed for **test-time scaling** through extended thinking and tool-calling capabilities.
+Zen Max is the largest model in the Zen family — a 1T+ reasoning-first language model designed for **test-time scaling** through extended thinking and tool-calling capabilities.
 
 Built as a **thinking agent**, Zen Max reasons step-by-step while using tools, executing **200-300 sequential tool calls** without human interference, reasoning coherently across hundreds of steps to solve complex problems.
 
-> **Note**: This repository contains configuration files and documentation for Zen Max. The full model weights (~1TB) are available from the base model: [moonshotai/Kimi-K2-Thinking](https://huggingface.co/moonshotai/Kimi-K2-Thinking). Zen-specific fine-tuning instructions and adapters will be provided in future releases.
-
 ### Key Capabilities
 
 #### 1. Agentic Reasoning (HLE: 44.9%)
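The agentic behaviour the model card describes — the model chaining hundreds of sequential tool calls with reasoning in between — reduces to a simple observe/act loop. A minimal runnable sketch; the `run_agent` helper, scripted policy, and toy tools here are illustrative stand-ins, not the repository's API (a real deployment would let the model choose each action):

```python
def run_agent(llm_step, tools, task, max_steps=300):
    """Toy sequential tool-calling loop: the policy picks a tool,
    observes its result, and continues for up to a few hundred steps."""
    observation, trace = task, []
    for _ in range(max_steps):
        action = llm_step(observation)  # the model would decide here
        if action["tool"] == "finish":
            return action["args"]["answer"], trace
        result = tools[action["tool"]](**action["args"])
        trace.append((action["tool"], result))
        observation = result
    return observation, trace  # step budget exhausted

# Scripted stand-ins for the model and its tools.
def scripted_policy(obs):
    if isinstance(obs, str):
        return {"tool": "search", "args": {"query": obs}}
    if obs < 100:
        return {"tool": "double", "args": {"x": obs}}
    return {"tool": "finish", "args": {"answer": obs}}

tools = {"search": lambda query: 13, "double": lambda x: x * 2}

answer, trace = run_agent(scripted_policy, tools, "seed value")
print(answer, len(trace))  # 104 4
```

The loop terminates either when the policy emits a `finish` action or when the step budget (here 300, matching the quoted 200-300 call range) runs out.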
@@ -116,9 +112,9 @@ Built as a **thinking agent**, Zen Max reasons step-by-step while using tools, e
 
 ## Training Approach
 
-### Base Architecture
-- Kimi K2 Thinking foundation
-- Mixture of Experts (MoE) components
+### Architecture
+- 1.04T parameter Mixture of Experts
+- 32B active parameters per token
 - Extended thinking token support
 - Multi-modal reasoning capabilities
 
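A quick back-of-envelope check of what the 1.04T-total / 32B-active split quoted in this README implies, assuming BF16 at 2 bytes per weight and ignoring activations and other overheads:

```python
# Back-of-envelope implications of the MoE figures in the card.
total_params = 1.044e12   # 1,044B total parameters
active_params = 32e9      # 32B routed parameters per token
bf16_bytes = 2            # BF16 stores one weight in 2 bytes

active_fraction = active_params / total_params
active_gb = active_params * bf16_bytes / 1e9   # weights touched per token
total_gb = total_params * bf16_bytes / 1e9     # full BF16 checkpoint

print(f"active fraction per token: {active_fraction:.1%}")  # 3.1%
print(f"BF16 active weights:  {active_gb:.0f} GB")          # 64 GB
print(f"BF16 full checkpoint: {total_gb:.0f} GB")           # 2088 GB
```

Per-token compute therefore scales with the 32B active slice, while storage and memory capacity must still cover all 1.04T parameters.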
@@ -208,8 +204,8 @@ messages = [
 ]
 
 response = model.chat(
-    tokenizer,
-    messages,
+    tokenizer,
+    messages,
     mode="heavy",  # 8 parallel rollouts
     thinking_budget=128000,
     enable_reflection=True
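`mode="heavy"` in the snippet above runs 8 parallel rollouts; a common way such parallel test-time scaling is aggregated is self-consistency voting. A toy, self-contained sketch — the `heavy_mode` helper and the deterministic stub generator are hypothetical illustrations, not the repository's implementation:

```python
from collections import Counter
from itertools import cycle

def heavy_mode(generate, prompt, rollouts=8):
    """Sample several independent rollouts and keep the majority
    answer (self-consistency voting over parallel rollouts)."""
    answers = [generate(prompt) for _ in range(rollouts)]
    best, count = Counter(answers).most_common(1)[0]
    return best, count / rollouts

# Deterministic stub; a real deployment would sample the model.
_samples = cycle(["42", "41", "42", "42", "42", "42", "41", "42"])
def fake_generate(prompt):
    return next(_samples)

answer, agreement = heavy_mode(fake_generate, "What is 2 * 21?")
print(answer, agreement)  # 42 0.75
```

The agreement ratio doubles as a cheap confidence signal: low agreement across rollouts suggests the question deserves a larger thinking budget.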
@@ -249,7 +245,7 @@ tools = {
 - **Budget Setup**: 1x 24GB GPU + 256GB RAM (~1-2 tokens/s)
 - **High Performance**: 4x A100 80GB or 8x A100 40GB
 
-### Alternative: GGUF Quantizations (Unsloth)
+### Alternative: GGUF Quantizations
 - **1.66-bit (UD-TQ1_0)**: 245GB - fits on 247GB combined RAM+VRAM
 - **2.71-bit (UD-Q2_K_XL)**: 381GB - recommended for accuracy
 - **4.5-bit (UD-Q4_K_XL)**: 588GB - near full precision
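The quoted file sizes can be sanity-checked against a simple lower bound of `params × bits / 8`; the gap at low bit-widths comes from per-block scales, embeddings, and metadata (bit-widths and sizes taken from the list above):

```python
# Lower-bound size estimate for the GGUF variants in the card.
params = 1.044e12  # 1,044B total parameters

lower_bound_gb = {}
for name, bits, quoted_gb in [
    ("UD-TQ1_0", 1.66, 245),
    ("UD-Q2_K_XL", 2.71, 381),
    ("UD-Q4_K_XL", 4.5, 588),
]:
    lower_bound_gb[name] = params * bits / 8 / 1e9
    print(f"{name}: >= {lower_bound_gb[name]:.0f} GB (quoted: {quoted_gb} GB)")
```

The 4.5-bit figure lands within about 1 GB of its bound, i.e. at that width almost the entire file is raw weights.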
@@ -263,13 +259,13 @@ tools = {
 ## Format Availability
 
 ### Current
-- ✅ SafeTensors (BF16, full precision)
-- ✅ INT4 Quantized (native QAT)
+- SafeTensors (BF16, full precision)
+- INT4 Quantized (native QAT)
 
 ### Coming Soon
-- 🔄 GGUF quantizations (Q4_K_M, Q5_K_M, Q8_0)
-- 🔄 MLX optimized formats (4-bit, 8-bit for Apple Silicon)
-- 🔄 ONNX export for edge deployment
+- GGUF quantizations (Q4_K_M, Q5_K_M, Q8_0)
+- MLX optimized formats (4-bit, 8-bit for Apple Silicon)
+- ONNX export for edge deployment
 
 ## Special Features
 
@@ -305,8 +301,7 @@ tools = {
 
 ## Training Data
 
-- **Base Training**: Kimi K2 Thinking pre-training corpus
-- **Zen Fine-Tuning**:
+- **Zen Fine-Tuning**:
   - Zoo-Gym framework with RAIS technology
   - Constitutional AI alignment data
   - Multi-turn tool-calling trajectories
@@ -320,26 +315,23 @@ tools = {
   title={Zen Max: Reasoning-First Language Model with Test-Time Scaling},
   author={Hanzo AI and Zoo Labs Foundation},
   year={2025},
-  url={https://zenlm.org},
-  note={Based on Moonshot AI Kimi K2 Thinking architecture}
+  url={https://zenlm.org}
 }
 ```
 
 ## Acknowledgments
 
-- **Moonshot AI**: K2 Thinking architecture and training methodology
 - **Hanzo AI**: Constitutional AI training and Zen identity
 - **Zoo Labs Foundation**: Open AI research and community governance
 
 ## Links
 
 - **Website**: https://zenlm.org
-- **HuggingFace**: https://huggingface.co/zenlm/zen-max
-- **GitHub**: https://github.com/zenlm/zen
-- **Moonshot AI**: https://www.moonshot.cn/
-- **K2 Thinking**: https://platform.moonshot.cn/docs/intro#kimi-k2-thinking
+- **API**: https://api.hanzo.ai/v1
+- **HuggingFace**: https://huggingface.co/zenlm
+- **GitHub**: https://github.com/zenlm
 
 ---
 
-**Zen AI**: Clarity Through Intelligence
+**Zen AI**: Clarity Through Intelligence
 *Now with reasoning at test-time*
 