---
language:
- en
license: mit
base_model: Qwen/Qwen2.5-Coder-7B
tags:
- zenith
- tenstorrent
- code
- reasoning
- moe
- ring-attention
- eq-adapter
- matrix-corp
pipeline_tag: text-generation
library_name: transformers
model_type: zenith
hardware:
- tenstorrent-blackhole-p300a
---
# Zenith-7B V1

Standard GPU-optimized language model with code generation and emotional intelligence capabilities.
## Features

- **7B Parameter Model**: Efficient for consumer GPUs (8-16 GB VRAM)
- **Code Generation**: Fine-tuned on the Qwen2.5-Coder base for strong programming ability
- **Emotional Intelligence**: EQ adapter for recognizing and responding to emotions
- **OpenThoughts Integration**: Trained on high-quality reasoning data
- **LoRA/QLoRA Support**: Efficient fine-tuning with 4-bit quantization
- **Ollama Compatible**: Ready-to-use Modelfile for easy deployment
## Quick Start

### Installation

```bash
# Clone and setup
cd Zenith/V1/7B
pip install -r requirements.txt
```
### Training

```bash
# Full fine-tuning
python train.py \
  --base_model Qwen/Qwen2.5-Coder-7B \
  --train_data path/to/train.json \
  --epochs 3 \
  --batch_size 4 \
  --learning_rate 2e-5
```

```bash
# LoRA fine-tuning (recommended for most users)
python train.py \
  --base_model Qwen/Qwen2.5-Coder-7B \
  --train_data path/to/train.json \
  --use_lora \
  --lora_r 16 \
  --lora_alpha 32 \
  --epochs 3 \
  --batch_size 8
```
### Inference

```bash
# Interactive mode
python inference.py --checkpoint ./outputs/checkpoint-final

# Single prompt
python inference.py \
  --checkpoint ./outputs/checkpoint-final \
  --prompt "Write a Python function to reverse a linked list" \
  --max_new_tokens 512
```
### Ollama Deployment

```bash
# Build and run with Ollama
ollama create zenith-7b -f Modelfile
ollama run zenith-7b "Explain quantum computing in simple terms"
```
## Project Structure

```text
Zenith/V1/7B/
├── configs/                  # Configuration files
│   ├── zenith_config.py      # Model architecture config
│   ├── data_config.py        # Data processing config
│   └── training_config.py    # Training hyperparameters
├── data/                     # Data processing modules
│   ├── openthoughts_processor.py
│   ├── quality_filter.py
│   ├── curriculum_sampler.py
│   ├── advanced_tokenizer.py
│   └── preprocessing.py
├── src/                      # Source code
│   ├── models/
│   │   ├── zenith_model.py
│   │   ├── dense_layer.py
│   │   └── moe_layer.py
│   └── utils/
├── scripts/                  # Utility scripts
├── tests/                    # Test suite
├── train.py                  # Main training script
├── inference.py              # Inference and generation
├── test_model.py             # Model validation tests
├── finetune_qwen.py          # Qwen fine-tuning guide
├── Modelfile                 # Ollama configuration
├── requirements.txt          # Python dependencies
└── README.md                 # This file
```
## Configuration

The model uses a unified configuration system in `configs/zenith_config.py`:

```python
from configs.zenith_config import get_7b_config

config = get_7b_config()
# Parameters:
# - hidden_size: 4096
# - num_layers: 32
# - num_heads: 32
# - num_experts: 0 (dense only, set >1 for MoE)
# - use_eq_adapter: True (emotional intelligence)
# - max_seq_len: 8192
```
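For experiments outside the repo, the parameters above can be mirrored in a minimal standalone config. This is an illustrative sketch: the field names follow the list above, not the actual `ZenithConfig` class, and `head_dim` is a derived convenience added here.

```python
from dataclasses import dataclass


@dataclass
class ZenithConfigSketch:
    # Mirrors the 7B defaults listed above (illustrative only).
    hidden_size: int = 4096
    num_layers: int = 32
    num_heads: int = 32
    num_experts: int = 0         # 0 = dense; >1 enables MoE layers
    use_eq_adapter: bool = True  # emotional-intelligence head
    max_seq_len: int = 8192

    @property
    def head_dim(self) -> int:
        # Per-head dimension follows from hidden_size / num_heads.
        return self.hidden_size // self.num_heads


cfg = ZenithConfigSketch()
print(cfg.head_dim)  # 4096 / 32 = 128
```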
## Data Processing

### OpenThoughts Integration

The data pipeline supports the OpenThoughts3-1.2M dataset:

```python
from data.openthoughts_processor import OpenThoughtsProcessor, OpenThoughtsConfig

config = OpenThoughtsConfig(
    dataset_name="open-thoughts/OpenThoughts3-1.2M",
    streaming=True,
    quality_filtering=True,
    curriculum_learning=True,
    augmentation=True,
)

processor = OpenThoughtsProcessor(config)
dataset = processor.load_dataset()
```
### Quality Filtering

Multi-dimensional quality assessment:
- Length appropriateness
- Language detection (English only)
- Repetition detection
- Coherence scoring
- Structure validation
- Thought quality (for CoT data)
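Two of these checks are easy to sketch in plain Python. These are illustrative heuristics, not the repo's `quality_filter.py`; the thresholds are made up for the example.

```python
def length_ok(text: str, min_chars: int = 32, max_chars: int = 8000) -> bool:
    """Length appropriateness: reject samples that are too short or too long."""
    return min_chars <= len(text) <= max_chars


def repetition_score(text: str, n: int = 3) -> float:
    """Fraction of repeated word n-grams; higher means more repetitive."""
    words = text.split()
    if len(words) < n:
        return 0.0
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    return 1.0 - len(set(ngrams)) / len(ngrams)


sample = "the cat sat on the mat and the cat sat on the mat"
print(length_ok(sample), round(repetition_score(sample), 2))
```

A production filter would combine several such scores into a single keep/drop decision, as the bullet list above suggests.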
### Curriculum Learning

Progressive training stages:

1. **Foundation**: High-quality, well-structured samples
2. **Reasoning**: Chain-of-thought and problem-solving
3. **Code**: Programming and technical content
4. **Full**: Complete dataset with all samples
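One simple way to realize such staging is to gate the sample pool by training progress. This is a sketch of the idea only; the stage boundaries below are assumed equal-width fractions, not the values used by the repo's `curriculum_sampler.py`.

```python
def stage_for_progress(progress: float) -> str:
    """Map training progress in [0, 1] to a curriculum stage (assumed boundaries)."""
    if progress < 0.25:
        return "foundation"  # high-quality, well-structured samples
    if progress < 0.5:
        return "reasoning"   # chain-of-thought and problem-solving
    if progress < 0.75:
        return "code"        # programming and technical content
    return "full"            # complete dataset


# At 60% of training we would be sampling from the code stage:
print(stage_for_progress(0.6))
```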
## Advanced Features

### MoE (Mixture of Experts)

Enable sparse activation for better performance:

```bash
python train.py --use_moe --num_experts 8
```

- Top-2 routing with load balancing
- 60% of layers use MoE (middle layers)
- Shared router groups for efficiency
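Top-2 routing means each token is dispatched only to the two highest-scoring experts, with their softmax gate values renormalized to sum to one. A minimal dependency-free sketch of the idea (not the repo's `moe_layer.py`, which operates on tensors and adds a load-balancing loss):

```python
import math


def top2_route(logits: list[float]) -> list[tuple[int, float]]:
    """Pick the two highest-scoring experts and renormalize their gates."""
    # Softmax over all expert logits.
    probs = [math.exp(x) for x in logits]
    total = sum(probs)
    probs = [p / total for p in probs]
    # Keep the top-2 experts and renormalize their probabilities.
    top2 = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:2]
    norm = sum(probs[i] for i in top2)
    return [(i, probs[i] / norm) for i in top2]


# One token's router logits over 4 experts:
print(top2_route([2.0, 0.5, 1.5, -1.0]))
```

The token's output is then the gate-weighted sum of the two selected experts' outputs, so only 2 of the 8 experts run per token.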
### EQ Adapter

Emotional intelligence module:

```bash
python train.py --use_eq_adapter --eq_loss_weight 0.1
```

- Frustration detection (regression)
- 8-emotion classification
- Fused with attention mechanism
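With `--eq_loss_weight 0.1`, the adapter's auxiliary losses are presumably folded into the language-modeling loss as a weighted sum. A sketch of that assumed combination (the actual formula lives in the training code):

```python
def combined_loss(lm_loss: float,
                  frustration_loss: float,
                  emotion_loss: float,
                  eq_loss_weight: float = 0.1) -> float:
    """LM loss plus weighted EQ-adapter auxiliary losses (assumed formulation)."""
    return lm_loss + eq_loss_weight * (frustration_loss + emotion_loss)


# e.g. 2.30 + 0.1 * (0.40 + 1.10)
print(round(combined_loss(2.30, 0.40, 1.10), 2))
```

A small weight like 0.1 keeps the EQ heads from dominating the language-modeling objective.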
### LoRA/QLoRA

Efficient fine-tuning with low-rank adaptation:

```bash
# LoRA
python train.py --use_lora --lora_r 16 --lora_alpha 32

# QLoRA (4-bit quantization)
python train.py --use_qlora --use_lora --lora_r 8
```
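The reason LoRA is cheap: instead of updating a full `d_out x d_in` weight, it learns two rank-`r` factors, so trainable parameters per layer shrink from `d_out * d_in` to `r * (d_in + d_out)`. A back-of-the-envelope sketch using the 7B config's 4096 hidden size:

```python
def lora_param_ratio(d_in: int, d_out: int, r: int) -> float:
    """Trainable-parameter fraction of a LoRA update vs. the full weight."""
    full = d_in * d_out
    lora = r * (d_in + d_out)  # A is r x d_in, B is d_out x r
    return lora / full


# A 4096x4096 projection with --lora_r 16 trains well under 1% of the weight:
print(f"{lora_param_ratio(4096, 4096, 16):.4%}")
```

QLoRA adds 4-bit quantization of the frozen base weights on top, which is why it fits in even less VRAM.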
## Testing

Run the test suite:

```bash
python test_model.py
```
Tests include:
- Model creation and initialization
- Forward pass and gradient flow
- Text generation
- Multi-task outputs (EQ adapter)
- Loss computation
## Requirements

See `requirements.txt` for full dependencies. Key packages:

- `torch>=2.0.0`
- `transformers>=4.35.0`
- `datasets>=2.14.0`
- `accelerate>=0.24.0`
- `peft>=0.6.0` (for LoRA)
- `bitsandbytes>=0.41.0` (for QLoRA)
- `tensorboard>=2.14.0`
## Performance Tips

- **Mixed Precision**: Use `--mixed_precision bf16` for faster training (Ampere+ GPUs)
- **Gradient Checkpointing**: Enabled by default to reduce memory
- **Batch Size**: Adjust based on VRAM (4-8 for full 7B fine-tuning, 16-32 for LoRA)
- **Sequence Length**: Longer sequences use more memory; adjust `--max_seq_length`
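When VRAM limits the per-device batch size, gradient accumulation recovers the same effective batch: the optimizer steps once per `per_device * accum_steps * num_devices` samples. Simple arithmetic, sketched:

```python
def effective_batch_size(per_device: int, accum_steps: int, num_devices: int = 1) -> int:
    """Samples contributing to each optimizer step."""
    return per_device * accum_steps * num_devices


# Fit a "batch of 32" on one card by accumulating 8 micro-batches of 4:
print(effective_batch_size(4, 8, 1))  # 32
```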
## Troubleshooting

### Out of Memory

- Reduce batch size
- Use gradient accumulation
- Enable LoRA/QLoRA
- Use mixed precision
- Reduce sequence length
### Slow Training

- Increase batch size if possible
- Use more gradient accumulation steps
- Ensure data loading is not the bottleneck
- Use mixed precision
### Poor Quality Outputs

- Train longer (more epochs)
- Use higher quality data
- Adjust learning rate (try 1e-5 to 5e-5)
- Enable curriculum learning
- Use quality filtering
## Citation

If you use Zenith-7B in your research, please cite:

```bibtex
@misc{zenith-7b-2025,
  title={Zenith-7B: A Hybrid MoE Model for Code and Emotional Intelligence},
  year={2025},
  publisher={Zenith Project}
}
```
## License

This model is released under the MIT license (see the `license` field in the metadata above).
## Contact

For issues and questions, please open an issue on the project repository.