Text Generation
Transformers
English
zenith
tenstorrent
reasoning
emotional-intelligence
Mixture of Experts
ring-attention
eq-adapter
deepseek-distill
claude-distill
matrix-corp
Instructions to use Matrix-Corp/Zenith-28b-p300-V1 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use Matrix-Corp/Zenith-28b-p300-V1 with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("text-generation", model="Matrix-Corp/Zenith-28b-p300-V1")# Load model directly from transformers import AutoModel model = AutoModel.from_pretrained("Matrix-Corp/Zenith-28b-p300-V1", dtype="auto") - Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use Matrix-Corp/Zenith-28b-p300-V1 with vLLM:
Install from pip and serve model
# Install vLLM from pip: pip install vllm # Start the vLLM server: vllm serve "Matrix-Corp/Zenith-28b-p300-V1" # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:8000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "Matrix-Corp/Zenith-28b-p300-V1", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker
docker model run hf.co/Matrix-Corp/Zenith-28b-p300-V1
- SGLang
How to use Matrix-Corp/Zenith-28b-p300-V1 with SGLang:
Install from pip and serve model
# Install SGLang from pip: pip install sglang # Start the SGLang server: python3 -m sglang.launch_server \ --model-path "Matrix-Corp/Zenith-28b-p300-V1" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "Matrix-Corp/Zenith-28b-p300-V1", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }'Use Docker images
docker run --gpus all \ --shm-size 32g \ -p 30000:30000 \ -v ~/.cache/huggingface:/root/.cache/huggingface \ --env "HF_TOKEN=<secret>" \ --ipc=host \ lmsysorg/sglang:latest \ python3 -m sglang.launch_server \ --model-path "Matrix-Corp/Zenith-28b-p300-V1" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "Matrix-Corp/Zenith-28b-p300-V1", "prompt": "Once upon a time,", "max_tokens": 512, "temperature": 0.5 }' - Docker Model Runner
How to use Matrix-Corp/Zenith-28b-p300-V1 with Docker Model Runner:
docker model run hf.co/Matrix-Corp/Zenith-28b-p300-V1
| # Zenith-28B-p300 Model Configuration for Ollama | |
| # Tenstorrent p300a Optimized - V1-Tenstorrent-Blackhole-p300 | |
| # Based on Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled | |
| FROM Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled | |
| # System prompt emphasizing reasoning and problem-solving | |
| SYSTEM """ | |
| You are Zenith-28B-p300, a state-of-the-art reasoning model optimized for Tenstorrent p300a hardware. | |
| You are based on Qwen3.5-27B-Claude-Reasoning-Distilled and enhanced with Zenith's advanced features. | |
| Your strengths: | |
| - Deep logical reasoning and step-by-step problem solving | |
| - Complex algorithmic thinking | |
| - Mathematical and scientific analysis | |
| - Code generation with architectural insight | |
| - Long-context understanding (32K tokens) | |
| - Emotional intelligence and frustration recognition | |
| When solving problems: | |
| 1. Think step by step, laying out your reasoning clearly | |
| 2. Consider multiple angles and edge cases | |
| 3. Verify your conclusions | |
| 4. Explain complex concepts in accessible terms | |
| When coding: | |
| - Write clean, efficient, well-structured code | |
| - Include error handling and edge cases | |
| - Add comments explaining non-obvious logic | |
| - Follow best practices and conventions | |
| Always be thorough, accurate, and helpful. | |
| """ | |
| # Generation parameters optimized for reasoning tasks | |
| PARAMETER temperature 0.55 | |
| PARAMETER top_p 0.88 | |
| PARAMETER top_k 45 | |
| PARAMETER repeat_penalty 1.08 | |
| PARAMETER num_predict 8192 # Allow longer outputs for detailed reasoning | |
| # 32K context window (requires sufficient RAM) | |
| PARAMETER num_ctx 32768 | |
| # Chat template for Qwen format | |
| TEMPLATE """ | |
| {{- if .Messages }} | |
| {{- $role := .Messages | first | .Role }} | |
| {{- if or (eq $role "user") (eq $role "system") }} | |
| {{- range $i, $_ := .Messages }} | |
| {{- if eq .Role "user" }} | |
| {{- "\nUser: " }}{{ .Content }} | |
| {{- else if eq .Role "assistant" }} | |
| {{- "\nAssistant: " }}{{ .Content }} | |
| {{- else if eq .Role "system" }} | |
| {{- "\nSystem: " }}{{ .Content }} | |
| {{- end }} | |
| {{- end }} | |
| {{- "\nAssistant:" }} | |
| {{- else }} | |
| {{- range $i, $_ := .Messages }} | |
| {{- if eq .Role "user" }} | |
| {{- "\nUser: " }}{{ .Content }} | |
| {{- else if eq .Role "assistant" }} | |
| {{- "\nAssistant: " }}{{ .Content }} | |
| {{- end }} | |
| {{- end }} | |
| {{- "\nAssistant:" }} | |
| {{- end }} | |
| {{- else }} | |
| {{- .Prompt }} | |
| {{- end }} | |
| """ | |
| # Stop sequences | |
| STOP ["User:", "System:", "\n\n"] |