Upload README.md with huggingface_hub
README.md CHANGED

@@ -2,35 +2,31 @@
 library_name: transformers
 license: apache-2.0
 tags:
--
-- kimi_k2
+- zen
 - text-generation
 - reasoning
 - agentic
 - tool-calling
+- moe
 - compressed-tensors
 pipeline_tag: text-generation
-base_model: moonshotai/Kimi-K2-Thinking
 ---

-# Zen Max
+# Zen Max

-**Organization**: [Zen LM](https://zenlm.org) (Hanzo AI × Zoo Labs Foundation)
-**Base Model**: Moonshot AI Kimi K2 Thinking (DeepseekV3ForCausalLM)
+**Organization**: [Zen LM](https://zenlm.org) (Hanzo AI × Zoo Labs Foundation)
 **Parameters**: 1.04T total (1,044B – MoE with 32B active per token)
-**License**: Apache 2.0
-**Context Window**: 256K tokens
-**Thinking Capacity**: 96K-128K thinking tokens per step
-**Architecture**:
+**License**: Apache 2.0
+**Context Window**: 256K tokens
+**Thinking Capacity**: 96K-128K thinking tokens per step
+**Architecture**: MoE (Mixture of Experts)

 ## Model Overview

-Zen Max is a 1T+ reasoning-first language model
+Zen Max is the largest model in the Zen family – a 1T+ reasoning-first language model designed for **test-time scaling** through extended thinking and tool-calling capabilities.

 Built as a **thinking agent**, Zen Max reasons step-by-step while using tools, executing **200-300 sequential tool calls** without human intervention, reasoning coherently across hundreds of steps to solve complex problems.

-> **Note**: This repository contains configuration files and documentation for Zen Max. The full model weights (~1TB) are available from the base model: [moonshotai/Kimi-K2-Thinking](https://huggingface.co/moonshotai/Kimi-K2-Thinking). Zen-specific fine-tuning instructions and adapters will be provided in future releases.
-
 ### Key Capabilities

 #### 1. Agentic Reasoning (HLE: 44.9%)
@@ -116,9 +112,9 @@ Built as a **thinking agent**, Zen Max reasons step-by-step while using tools, e

 ## Training Approach

-###
--
--
+### Architecture
+- 1.04T parameter Mixture of Experts
+- 32B active parameters per token
 - Extended thinking token support
 - Multi-modal reasoning capabilities

@@ -208,8 +204,8 @@ messages = [
 ]

 response = model.chat(
-    tokenizer,
-    messages,
+    tokenizer,
+    messages,
     mode="heavy",  # 8 parallel rollouts
     thinking_budget=128000,
     enable_reflection=True
@@ -249,7 +245,7 @@ tools = {
 - **Budget Setup**: 1x 24GB GPU + 256GB RAM (~1-2 tokens/s)
 - **High Performance**: 4x A100 80GB or 8x A100 40GB

-### Alternative: GGUF Quantizations
+### Alternative: GGUF Quantizations
 - **1.66-bit (UD-TQ1_0)**: 245GB - fits on 247GB combined RAM+VRAM
 - **2.71-bit (UD-Q2_K_XL)**: 381GB - recommended for accuracy
 - **4.5-bit (UD-Q4_K_XL)**: 588GB - near full precision
@@ -263,13 +259,13 @@ tools = {
 ## Format Availability

 ### Current
--
--
+- SafeTensors (BF16, full precision)
+- INT4 Quantized (native QAT)

 ### Coming Soon
--
--
--
+- GGUF quantizations (Q4_K_M, Q5_K_M, Q8_0)
+- MLX optimized formats (4-bit, 8-bit for Apple Silicon)
+- ONNX export for edge deployment

 ## Special Features

@@ -305,8 +301,7 @@ tools = {

 ## Training Data

-- **
-- **Zen Fine-Tuning**:
+- **Zen Fine-Tuning**:
   - Zoo-Gym framework with RAIS technology
   - Constitutional AI alignment data
   - Multi-turn tool-calling trajectories
@@ -320,26 +315,23 @@ tools = {
   title={Zen Max: Reasoning-First Language Model with Test-Time Scaling},
   author={Hanzo AI and Zoo Labs Foundation},
   year={2025},
-  url={https://zenlm.org}
-  note={Based on Moonshot AI Kimi K2 Thinking architecture}
+  url={https://zenlm.org}
 }
 ```

 ## Acknowledgments

-- **Moonshot AI**: K2 Thinking architecture and training methodology
 - **Hanzo AI**: Constitutional AI training and Zen identity
 - **Zoo Labs Foundation**: Open AI research and community governance

 ## Links

 - **Website**: https://zenlm.org
-- **
-- **
-- **
-- **K2 Thinking**: https://platform.moonshot.cn/docs/intro#kimi-k2-thinking
+- **API**: https://api.hanzo.ai/v1
+- **HuggingFace**: https://huggingface.co/zenlm
+- **GitHub**: https://github.com/zenlm

 ---

-**Zen AI**: Clarity Through Intelligence
+**Zen AI**: Clarity Through Intelligence
 *Now with reasoning at test-time*
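The truncated `model.chat(...)` call shown in the diff names three parameters: `mode` (where `"heavy"` runs 8 parallel rollouts), `thinking_budget`, and `enable_reflection`. The stub below is a sketch of that call shape only; `StubZenMax`, `ChatResponse`, and the `ROLLOUTS` mapping are invented for illustration, since the card does not publish the real signature beyond the snippet.

```python
from dataclasses import dataclass

# Illustrative stub: only mode="heavy" = 8 rollouts and the ~128K
# per-step thinking ceiling come from the model card itself.

@dataclass
class ChatResponse:
    text: str
    rollouts: int          # parallel rollouts used for this reply
    thinking_tokens: int   # thinking budget actually granted

class StubZenMax:
    ROLLOUTS = {"light": 1, "heavy": 8}  # "light" is an assumed default mode
    MAX_THINKING = 128_000               # card: 96K-128K thinking tokens/step

    def chat(self, tokenizer, messages, mode="light",
             thinking_budget=96_000, enable_reflection=False):
        # Clamp the requested budget to the per-step ceiling.
        budget = min(thinking_budget, self.MAX_THINKING)
        last_user = messages[-1]["content"]
        return ChatResponse(text=f"echo: {last_user}",
                            rollouts=self.ROLLOUTS[mode],
                            thinking_tokens=budget)

model = StubZenMax()
messages = [{"role": "user", "content": "Plan a 3-step proof."}]
response = model.chat(None, messages, mode="heavy",
                      thinking_budget=128_000, enable_reflection=True)
```

A real deployment would obtain `model` and `tokenizer` from the loaded checkpoint rather than a stub.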
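The GGUF file sizes quoted in the diff line up with simple arithmetic on the parameter count: raw weight storage is roughly params × bits-per-weight / 8 bytes, and the gap to the listed sizes is quantization metadata (block scales) plus any layers kept at higher precision. A quick sanity check (my arithmetic, not from the card):

```python
# Raw weight storage for a quantized checkpoint, ignoring metadata overhead.
def quant_size_gb(total_params: float, bits_per_weight: float) -> float:
    """Approximate weight bytes in GB (1 GB = 1e9 bytes)."""
    return total_params * bits_per_weight / 8 / 1e9

PARAMS = 1.044e12  # 1,044B total parameters, per the card

for bits, listed_gb in [(1.66, 245), (2.71, 381), (4.5, 588)]:
    raw = quant_size_gb(PARAMS, bits)
    print(f"{bits}-bit: ~{raw:.0f} GB raw vs {listed_gb} GB listed")
```

The 4.5-bit estimate (~587 GB) nearly matches the listed 588 GB, while the lower-bit variants carry proportionally more scale/zero-point overhead.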
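The "200-300 sequential tool calls" claim describes a standard propose/execute agent loop: at each step the model either names a tool to run or returns a final answer, and the runtime executes the tool and feeds the result back. A minimal sketch (the step dictionary format, tool registry, and stopping rule are illustrative assumptions, not Zen Max's actual protocol):

```python
# Generic propose/execute loop; `propose_step` stands in for the model.
def run_agent(propose_step, tools, max_steps=300):
    """Drive tool calls until the model emits a final answer or the cap hits."""
    history = []
    for _ in range(max_steps):
        step = propose_step(history)
        if step["type"] == "final":
            return step["answer"], len(history)
        # Execute the requested tool and feed the result back as context.
        result = tools[step["tool"]](step["args"])
        history.append((step["tool"], result))
    raise RuntimeError("tool-call budget exhausted without a final answer")

# Toy driver standing in for the model: two adds, then a final answer.
def fake_model(history):
    if len(history) < 2:
        return {"type": "tool", "tool": "add", "args": (len(history), 1)}
    return {"type": "final", "answer": sum(r for _, r in history)}

answer, steps = run_agent(fake_model, {"add": lambda ab: ab[0] + ab[1]})
```

In practice the model's structured tool-call output replaces `fake_model`, and the runtime must also keep the accumulated history within the 256K-token context window.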