Zen-Next (80B)
Part of the Zen AI Model Family
Model Description
Parameters: 80B
Architecture: Zen
Specialization: Complex reasoning & extended context
Training: Flagship training with constitutional AI
Context: 32K-128K tokens
Thinking: Up to 1,000,000 tokens
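With a 32K–128K token context window, long conversations may need trimming before they reach the model. The helper below is a hypothetical sketch (not part of this repository): it drops the oldest turns until a crude character-based token estimate fits the budget; a real deployment should count tokens with the model's own tokenizer instead.

```python
# Hypothetical helper: drop oldest chat turns until the history fits
# the context budget. Tokens are estimated crudely as ~1 per 4
# characters (an assumption); use the real tokenizer in practice.
def trim_history(messages, max_tokens=32_000):
    def est(m):  # rough per-message token estimate, plus overhead
        return len(m["content"]) // 4 + 4
    kept = list(messages)
    while len(kept) > 1 and sum(est(m) for m in kept) > max_tokens:
        kept.pop(0)  # drop the oldest turn first
    return kept
```

Trimming from the front keeps the most recent turns, which usually matter most for the next reply.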
Files in This Repository
This repository contains ALL formats and quantizations:
SafeTensors (Original)
model.safetensors - Full precision weights
config.json - Model configuration
tokenizer.json - Fast tokenizer
GGUF Quantized
zen-next-80b-instruct-Q4_K_M.gguf - 4-bit (recommended)
zen-next-80b-instruct-Q5_K_M.gguf - 5-bit (balanced)
zen-next-80b-instruct-Q8_0.gguf - 8-bit (high quality)
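To choose between these quantizations, it helps to estimate their on-disk (and roughly in-memory) size. The sketch below uses approximate average bits-per-weight figures for each scheme (K-quants mix block sizes, so these are illustrative assumptions, not exact values from this repository):

```python
# Rough size estimate for an 80B-parameter model at common GGUF
# quantization levels. Bits-per-weight values are approximate
# averages (assumption), used only for back-of-envelope sizing.
PARAMS = 80e9
BITS_PER_WEIGHT = {"Q4_K_M": 4.8, "Q5_K_M": 5.7, "Q8_0": 8.5}

def approx_gib(quant: str) -> float:
    """Approximate file size in GiB for the given quantization."""
    return PARAMS * BITS_PER_WEIGHT[quant] / 8 / 2**30

for q in BITS_PER_WEIGHT:
    print(f"{q}: ~{approx_gib(q):.0f} GiB")
```

At 80B parameters even Q4_K_M lands in the tens of gigabytes, so check available RAM/VRAM before picking a file.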
MLX (Apple Silicon)
mlx-4bit/ - 4-bit quantized for M-series
mlx-8bit/ - 8-bit quantized for M-series
Performance
| Benchmark | Score | Rank |
|-----------|-------|------|
| MMLU | 75.6% | Top 10% |
| GSM8K | 82.1% | Top 15% |
| HumanEval | 61.7% | Top 20% |
Quick Start
Transformers
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("zenlm/zen-next-80b-instruct", torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("zenlm/zen-next-80b-instruct")
messages = [{"role": "user", "content": "Your question here"}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True, enable_thinking=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
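When thinking is enabled, the reasoning trace typically needs to be separated from the final answer before display. The sketch below assumes the model wraps its reasoning in `<think>...</think>` tags (an assumption; verify against the tokenizer's chat template):

```python
# Hypothetical post-processing: split a <think>...</think> reasoning
# block (assumed tag format) from the final answer text.
import re

def split_thinking(text: str):
    match = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not match:
        return "", text.strip()  # no thinking block found
    thinking = match.group(1).strip()
    answer = text[match.end():].strip()
    return thinking, answer
```

Keeping the trace separate lets you log it for debugging while showing users only the answer.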
GGUF with llama.cpp
./llama-cli -m zen-next-80b-instruct-Q4_K_M.gguf -p "Your prompt" -n 512
(older llama.cpp builds name this binary ./main)
MLX for Apple Silicon
from mlx_lm import load, generate
model, tokenizer = load("zenlm/zen-next-80b-instruct")
response = generate(model, tokenizer, "Your prompt", max_tokens=200)
Unique Training Background
Flagship training with constitutional AI
This model was specifically optimized for complex reasoning & extended context with careful attention to:
- Inference efficiency
- Memory footprint
- Quality preservation
- Thinking capabilities
Part of the Zen Family • Collection • GitHub