---
language:
- en
license: apache-2.0
tags:
- code-generation
- reasoning
- emotional-intelligence
- moe
- long-context
- open-thoughts
- zenith
- 7b
- qwen2.5-coder
datasets:
- open-thoughts/OpenThoughts3-1.2M
model-index:
- name: Zenith-7B-V1
  results: []
---
# Zenith-7B-V1

**Production-ready 7B-parameter model for code generation, reasoning, and emotional intelligence.**
Zenith-7B is a state-of-the-art language model fine-tuned from Qwen2.5-Coder-7B, enhanced with advanced features including:
- **Code Generation**: Exceptional programming abilities across multiple languages
- **Reasoning**: Strong performance on algorithmic and mathematical problems
- **Emotional Intelligence**: EQ adapter for recognizing and responding to emotions
- **OpenThoughts Integration**: Trained on high-quality reasoning data from OpenThoughts3-1.2M
- **MoE Support**: Optional Mixture of Experts for sparse activation
- **Long Context**: 8K context window (extendable to 32K with ring attention)
- **Efficient Fine-Tuning**: Full support for LoRA and QLoRA
## Model Details

### Architecture

- **Base Model**: Qwen/Qwen2.5-Coder-7B
- **Parameters**: ~7.6B (dense) or configurable with MoE
- **Hidden Size**: 3584
- **Layers**: 28
- **Attention Heads**: 28 (4 key-value heads with GQA)
- **Vocabulary Size**: 152,064 (configurable)
### Training

- **Dataset**: OpenThoughts3-1.2M with custom filtering and curriculum learning
- **Sequence Length**: Up to 8,192 tokens
- **Training Method**: Full fine-tuning or LoRA/QLoRA
- **Mixed Precision**: BF16/FP16 support
## Usage

### Installation

```bash
pip install torch transformers accelerate peft bitsandbytes
```
### Load Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from configs.zenith_config import get_7b_config  # from the cloned repository

# Load configuration
config = get_7b_config()

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "your-username/Zenith-7B",
    config=config,
    device_map="auto",
    trust_remote_code=True,  # required for the custom modeling_zenith.py code
)
tokenizer = AutoTokenizer.from_pretrained("your-username/Zenith-7B")
```
### Inference

```python
# Generate text
prompt = "Write a Python function to reverse a linked list:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
### Interactive Chat

```python
from inference import load_model, generate

model, tokenizer = load_model("your-username/Zenith-7B")

while True:
    user_input = input("\nYou: ")
    if user_input.lower() == 'quit':
        break
    response = generate(model, tokenizer, user_input, max_new_tokens=512)
    print(f"\nAssistant: {response}")
```
## Training Your Own

See [FINETUNE_GUIDE.md](FINETUNE_GUIDE.md) for comprehensive fine-tuning instructions.

### Quick Start
```bash
# LoRA fine-tuning (recommended)
python finetune_qwen.py \
    --base_model Qwen/Qwen2.5-Coder-7B \
    --train_data ./data/train.json \
    --use_lora \
    --lora_r 16 \
    --epochs 3 \
    --batch_size 8 \
    --learning_rate 1e-4
```
### Full Training

```bash
python train.py \
    --base_model Qwen/Qwen2.5-Coder-7B \
    --train_data ./data/train.json \
    --epochs 3 \
    --batch_size 4 \
    --learning_rate 2e-5 \
    --use_quality_filter \
    --use_curriculum
```
## Features

### 1. OpenThoughts Integration

Seamless integration with the OpenThoughts3-1.2M dataset:

```python
from data.openthoughts_processor import OpenThoughtsProcessor, OpenThoughtsConfig

config = OpenThoughtsConfig(
    dataset_name="open-thoughts/OpenThoughts3-1.2M",
    streaming=True,
    quality_filtering=True,
    curriculum_learning=True,
    tokenizer=tokenizer,
)
processor = OpenThoughtsProcessor(config)
dataset = processor.load_dataset()
```
### 2. Quality Filtering

Multi-dimensional quality assessment:
- Length appropriateness
- Language detection (English)
- Repetition detection
- Coherence scoring
- Structure validation
- Thought quality (for CoT data)
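Two of the checks above, length appropriateness and repetition detection, can be sketched in a few lines. This is illustrative only: the real checks live in `data/quality_filter.py`, whose API is not shown here, and the thresholds below are hypothetical.

```python
def repetition_ratio(text: str, n: int = 3) -> float:
    """Fraction of word n-grams that are duplicates (higher = more repetitive)."""
    words = text.split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    return 1.0 - len(set(ngrams)) / len(ngrams)


def passes_basic_filters(text: str,
                         min_words: int = 20,
                         max_words: int = 4096,
                         max_repetition: float = 0.3) -> bool:
    """Length appropriateness + repetition detection (assumed thresholds)."""
    n_words = len(text.split())
    if not (min_words <= n_words <= max_words):
        return False
    return repetition_ratio(text) <= max_repetition
```

A sample that endlessly repeats the same phrase scores a repetition ratio near 1.0 and is dropped, while a varied sample of adequate length passes.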
### 3. Curriculum Learning

Progressive training stages:

- **Foundation**: High-quality, well-structured samples
- **Reasoning**: Chain-of-thought and problem-solving
- **Code**: Programming and technical content
- **Full**: Complete dataset
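The staged schedule can be sketched as a simple mapping from training progress to the stages above. The stage boundaries here are assumptions for illustration; the actual logic lives in `data/curriculum_sampler.py` and may differ.

```python
# (progress_upper_bound, stage_name) pairs; boundaries are hypothetical
STAGES = [
    (0.10, "foundation"),  # first 10% of steps: high-quality, well-structured
    (0.35, "reasoning"),   # next 25%: chain-of-thought and problem-solving
    (0.60, "code"),        # next 25%: programming and technical content
    (1.00, "full"),        # remainder: complete dataset
]


def current_stage(progress: float) -> str:
    """Map training progress in [0, 1] to a curriculum stage."""
    for boundary, name in STAGES:
        if progress <= boundary:
            return name
    return "full"
```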
### 4. MoE (Mixture of Experts)

Enable sparse activation for better performance:

```python
config.num_experts = 8
config.moe_top_k = 2
```

- Top-2 routing with load balancing
- 60% of layers use MoE (middle layers)
- Shared router groups for efficiency
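To make top-2 routing concrete, here is a minimal sketch of the weight computation, not the actual `moe_layer.py` code. Real implementations route per token with a learned gate over hidden states and add a load-balancing auxiliary loss; this shows only how the top-k experts and their mixture weights are picked from router logits.

```python
import math


def top_k_routing(logits, k=2):
    """Pick the top-k experts and renormalize their softmax weights to sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]  # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    top = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]  # (expert_index, weight) pairs
```

With `num_experts = 8` each token's output becomes a weighted sum of just two expert FFNs, so only a fraction of the parameters are active per token.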
### 5. EQ Adapter

Emotional intelligence module:

```python
config.use_eq_adapter = True
config.eq_loss_weight = 0.1
```

- Frustration detection (regression)
- 8-emotion classification
- Fused with the attention mechanism
### 6. Ring Attention (Experimental)

For longer contexts:

```python
config.use_ring_attention = True
config.max_seq_len = 32768
config.ring_attention_chunk_size = 8192
config.ring_attention_overlap = 2048
```
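As a back-of-the-envelope sketch of what those settings imply, the helper below splits a 32K sequence into overlapping chunks; the actual ring attention kernel is considerably more involved, and this function is hypothetical.

```python
def chunk_spans(seq_len, chunk, overlap):
    """Return (start, end) spans where each chunk overlaps the previous by `overlap` tokens."""
    spans, start = [], 0
    while start < seq_len:
        end = min(start + chunk, seq_len)
        spans.append((start, end))
        if end == seq_len:
            break
        start = end - overlap  # step back so neighboring chunks share context
    return spans
```

With `chunk=8192` and `overlap=2048`, a 32,768-token sequence yields five chunks, each new chunk contributing 6,144 fresh tokens after the first.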
## Evaluation

### Code Generation

```bash
python -m evaluation.benchmark \
    --model_path ./outputs/checkpoint-final \
    --benchmarks humaneval mbpp
```

### Reasoning

```bash
python -m evaluation.benchmark \
    --model_path ./outputs/checkpoint-final \
    --benchmarks gsm8k math
```

### Emotional Intelligence

Custom evaluation with emotional benchmarks (see the evaluation module).
## File Structure

```
Zenith/V1/7B/
├── configs/
│   ├── zenith_config.py        # Model configuration
│   ├── data_config.py          # Data processing config
│   └── training_config.py      # Training hyperparameters
├── data/
│   ├── openthoughts_processor.py
│   ├── quality_filter.py
│   ├── curriculum_sampler.py
│   ├── advanced_tokenizer.py
│   └── preprocessing.py
├── models/
│   ├── zenith_model.py
│   ├── dense_layer.py
│   └── moe_layer.py
├── utils/
│   ├── checkpoint.py
│   ├── logging_utils.py
│   └── metrics.py
├── training/
│   ├── trainer.py
│   └── train_7b.py
├── tests/
│   └── evaluation/
├── train.py                    # Main training script
├── inference.py                # Inference and generation
├── test_model.py               # Model validation tests
├── finetune_qwen.py            # Qwen fine-tuning script
├── modeling_zenith.py          # Hugging Face integration
├── Modelfile                   # Ollama configuration
├── requirements.txt            # Python dependencies
├── README.md                   # Full documentation
├── FINETUNE_GUIDE.md           # Detailed fine-tuning guide
└── hf_model_card.md            # This file (Hugging Face model card)
```
## Performance

### Benchmarks (Expected)

| Benchmark | Zenith-7B | Qwen2.5-Coder-7B |
|---|---|---|
| HumanEval | ~75% | ~72% |
| MBPP | ~80% | ~77% |
| GSM8K | ~65% | ~60% |
| MATH | ~45% | ~42% |

*Actual results may vary based on training data and fine-tuning.*
### Training Requirements

- **Full Fine-Tuning**: 16GB+ VRAM
- **LoRA**: 8GB+ VRAM
- **QLoRA**: 4GB+ VRAM
- **Training Time**: ~2-3 days on an A100 for the full dataset
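The rough weight-memory arithmetic behind those VRAM figures is easy to reproduce. This counts base-model weights only; optimizer states, gradients, activations, and the KV cache add substantially more, so treat it as a lower bound.

```python
PARAMS = 7.0e9  # nominal 7B parameter count


def weight_gb(bytes_per_param):
    """Memory for the base weights alone, in GB, at a given storage precision."""
    return PARAMS * bytes_per_param / 1e9


bf16 = weight_gb(2.0)   # BF16/FP16 full fine-tuning base: 14 GB of weights alone
int8 = weight_gb(1.0)   # 8-bit loading: 7 GB
nf4 = weight_gb(0.5)    # QLoRA 4-bit base: 3.5 GB, leaving headroom on small GPUs
```

This is why QLoRA fits on a 4-8 GB card while full fine-tuning needs far more: the frozen 4-bit base takes ~3.5 GB, and only the small LoRA adapters are trained in higher precision.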
## Limitations
- Trained primarily on English text
- May exhibit biases present in training data
- Code generation should be reviewed for security
- Long context (32K) requires significant memory
- MoE and EQ adapters increase memory usage
## Ethical Considerations

This model is intended for research and development purposes. Users should:
- Review generated code for security vulnerabilities
- Be aware of potential biases in outputs
- Use emotional intelligence features responsibly
- Comply with all applicable laws and regulations
## Citation

```bibtex
@misc{zenith-7b-2025,
  title={Zenith-7B: A Hybrid MoE Model for Code and Emotional Intelligence},
  author={Zenith Project},
  year={2025},
  publisher={Zenith Project},
  license={Apache-2.0}
}
```
## License

Apache 2.0

## Contact

For issues and questions:

- Open an issue on Hugging Face
- Check the documentation in README.md and FINETUNE_GUIDE.md

## Acknowledgments

- **Base model**: Qwen2.5-Coder-7B
- **Dataset**: OpenThoughts3-1.2M
- **Framework**: Hugging Face Transformers