---
language:
- en
license: mit
base_model: Qwen/Qwen2.5-Coder-7B
tags:
- zenith
- tenstorrent
- code
- reasoning
- moe
- ring-attention
- eq-adapter
- matrix-corp
pipeline_tag: text-generation
library_name: transformers
model_type: zenith
hardware:
- tenstorrent-blackhole-p300a
---
# Zenith-7B V1
Standard GPU-optimized language model with code generation and emotional intelligence capabilities.
## Features
- **7B Parameter Model**: Sized for consumer GPUs (8-16GB VRAM); the low end of that range generally requires 4-bit quantization
- **Code Generation**: Built on the Qwen2.5-Coder-7B base model for strong programming performance
- **Emotional Intelligence**: EQ adapter for recognizing and responding to emotions
- **OpenThoughts Integration**: Trained on high-quality reasoning data
- **LoRA/QLoRA Support**: Efficient fine-tuning with 4-bit quantization
- **Ollama Compatible**: Ready-to-use Modelfile for easy deployment
## Quick Start
### Installation
```bash
# From a checkout of the repository, enter the model directory and install dependencies
cd Zenith/V1/7B
pip install -r requirements.txt
```
### Training
```bash
# Full fine-tuning
python train.py \
  --base_model Qwen/Qwen2.5-Coder-7B \
  --train_data path/to/train.json \
  --epochs 3 \
  --batch_size 4 \
  --learning_rate 2e-5

# LoRA fine-tuning (recommended for most users)
python train.py \
  --base_model Qwen/Qwen2.5-Coder-7B \
  --train_data path/to/train.json \
  --use_lora \
  --lora_r 16 \
  --lora_alpha 32 \
  --epochs 3 \
  --batch_size 8
```
### Inference
```bash
# Interactive mode
python inference.py --checkpoint ./outputs/checkpoint-final

# Single prompt
python inference.py \
  --checkpoint ./outputs/checkpoint-final \
  --prompt "Write a Python function to reverse a linked list" \
  --max_new_tokens 512
```
### Ollama Deployment
```bash
# Build and run with Ollama
ollama create zenith-7b -f Modelfile
ollama run zenith-7b "Explain quantum computing in simple terms"
```
## Project Structure
```
Zenith/V1/7B/
├── configs/                      # Configuration files
│   ├── zenith_config.py          # Model architecture config
│   ├── data_config.py            # Data processing config
│   └── training_config.py        # Training hyperparameters
├── data/                         # Data processing modules
│   ├── openthoughts_processor.py
│   ├── quality_filter.py
│   ├── curriculum_sampler.py
│   ├── advanced_tokenizer.py
│   └── preprocessing.py
├── src/                          # Source code
│   ├── models/
│   │   ├── zenith_model.py
│   │   ├── dense_layer.py
│   │   └── moe_layer.py
│   └── utils/
├── scripts/                      # Utility scripts
├── tests/                        # Test suite
├── train.py                      # Main training script
├── inference.py                  # Inference and generation
├── test_model.py                 # Model validation tests
├── finetune_qwen.py              # Qwen fine-tuning guide
├── Modelfile                     # Ollama configuration
├── requirements.txt              # Python dependencies
└── README.md                     # This file
```
## Configuration
The model uses a unified configuration system in `configs/zenith_config.py`:
```python
from configs.zenith_config import get_7b_config
config = get_7b_config()
# Parameters:
# - hidden_size: 4096
# - num_layers: 32
# - num_heads: 32
# - num_experts: 0 (dense only, set >1 for MoE)
# - use_eq_adapter: True (emotional intelligence)
# - max_seq_len: 8192
```
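To make the relationship between these fields concrete, here is a minimal sketch. `SketchConfig` is a hypothetical stand-in, not the class that `get_7b_config()` actually returns; the field names and values are copied from the comment above, while `head_dim` and `is_moe` are illustrative helpers.

```python
from dataclasses import dataclass


@dataclass
class SketchConfig:
    """Hypothetical stand-in for the object returned by get_7b_config()."""

    hidden_size: int = 4096
    num_layers: int = 32
    num_heads: int = 32
    num_experts: int = 0  # 0 = dense only; >1 enables MoE
    use_eq_adapter: bool = True
    max_seq_len: int = 8192

    @property
    def head_dim(self) -> int:
        # Each attention head gets an equal slice of the hidden dimension.
        if self.hidden_size % self.num_heads != 0:
            raise ValueError("hidden_size must be divisible by num_heads")
        return self.hidden_size // self.num_heads

    @property
    def is_moe(self) -> bool:
        # Mirrors the "set >1 for MoE" rule from the comment above.
        return self.num_experts > 1


cfg = SketchConfig()
print(cfg.head_dim, cfg.is_moe)
```

With the listed defaults, each of the 32 heads works on a 128-dimensional slice, and the model stays dense until `num_experts` is raised above 1.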
## Data Processing
### OpenThoughts Integration
The data pipeline supports the OpenThoughts3-1.2M dataset:
```python
from data.openthoughts_processor import OpenThoughtsProcessor, OpenThoughtsConfig

config = OpenThoughtsConfig(
    dataset_name="open-thoughts/OpenThoughts3-1.2M",
    streaming=True,
    quality_filtering=True,
    curriculum_learning=True,
    augmentation=True,
)
processor = OpenThoughtsProcessor(config)
dataset = processor.load_dataset()
```
### Quality Filtering
Multi-dimensional quality assessment:
- Length appropriateness
- Language detection (English only)
- Repetition detection
- Coherence scoring
- Structure validation
- Thought quality (for CoT data)
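As an illustration of two of these checks (length and repetition), the sketch below filters a sample using only the standard library. It is not the code in `data/quality_filter.py`; the function names and every threshold are assumptions chosen for the example.

```python
from collections import Counter


def repetition_ratio(text: str, n: int = 3) -> float:
    """Fraction of word n-grams that duplicate an earlier n-gram."""
    words = text.split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    counts = Counter(ngrams)
    # Every occurrence beyond the first counts as a repeat.
    repeated = sum(c - 1 for c in counts.values())
    return repeated / len(ngrams)


def passes_quality(
    text: str,
    min_words: int = 5,        # assumed lower bound
    max_words: int = 2048,     # assumed upper bound
    max_repetition: float = 0.3,  # assumed tolerance
) -> bool:
    """Length check first (cheap), then the repetition check."""
    n_words = len(text.split())
    if not (min_words <= n_words <= max_words):
        return False
    return repetition_ratio(text) <= max_repetition


print(passes_quality("the cat sat on the mat and looked around"))
print(passes_quality("spam spam spam spam spam spam spam spam"))
```

The remaining dimensions (language detection, coherence, structure, thought quality) would need their own scorers and are left out of this sketch.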
### Curriculum Learning
Progressive training stages:
1. **Foundation**: High-quality, well-structured samples
2. **Reasoning**: Chain-of-thought and problem-solving
3. **Code**: Programming and technical content
4. **Full**: Complete dataset with all samples
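One simple way to schedule these stages is to map training progress to a stage name. The sketch below assumes fixed progress boundaries; the boundary values and the `stage_for_step` helper are hypothetical and do not describe how `data/curriculum_sampler.py` is implemented.

```python
STAGES = ["foundation", "reasoning", "code", "full"]
# Assumed cumulative fraction of training at which each stage ends.
STAGE_ENDS = [0.2, 0.5, 0.75, 1.0]


def stage_for_step(step: int, total_steps: int) -> str:
    """Return the curriculum stage active at a given training step."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp so out-of-range steps still map to a valid stage.
    progress = min(max(step / total_steps, 0.0), 1.0)
    for name, end in zip(STAGES, STAGE_ENDS):
        if progress < end:
            return name
    return STAGES[-1]


for step in (100, 300, 600, 900):
    print(step, stage_for_step(step, total_steps=1000))
```

A sampler built on this would draw only from the data subset matching the current stage, widening to the complete dataset in the final stage.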
## Advanced Features
### MoE (Mixture of Experts)
Enable sparse expert activation, which adds model capacity while each token is processed by only a subset of experts:
```bash
python train.py --use_moe --num_experts 8
```
- Top-2 routing with load balancing
- 60% of layers use MoE (middle layers)
- Shared router groups for efficiency
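The routing idea can be sketched for a single token in plain Python. This is a minimal illustration, not the code in `src/models/moe_layer.py`: the auxiliary loss follows the common Switch-Transformer-style formulation, which is an assumption about what "load balancing" means here, and shared router groups are not modeled.

```python
import math


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top2_route(logits):
    """Pick the two most probable experts and renormalize their weights."""
    probs = softmax(logits)
    top2 = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:2]
    norm = sum(probs[i] for i in top2)
    return [(i, probs[i] / norm) for i in top2]


def load_balance_loss(batch_logits):
    """num_experts * sum_e(dispatch_fraction_e * mean_router_prob_e).

    Close to 1.0 when tokens are spread evenly; larger when a few
    experts receive most of the traffic.
    """
    num_experts = len(batch_logits[0])
    num_tokens = len(batch_logits)
    dispatch = [0.0] * num_experts
    mean_prob = [0.0] * num_experts
    for logits in batch_logits:
        for e, p in enumerate(softmax(logits)):
            mean_prob[e] += p / num_tokens
        for e, _ in top2_route(logits):
            dispatch[e] += 1.0 / (2 * num_tokens)  # two slots per token
    return num_experts * sum(f * p for f, p in zip(dispatch, mean_prob))


print(top2_route([2.0, 1.0, 0.0, -1.0]))
```

Adding this loss to the training objective penalizes routers that keep sending tokens to the same experts.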
### EQ Adapter
Emotional intelligence module:
```bash
python train.py --use_eq_adapter --eq_loss_weight 0.1
```
- Frustration detection (regression)
- 8-emotion classification
- Fused with attention mechanism
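A plausible reading of `--eq_loss_weight` is that it scales the auxiliary EQ losses before they are added to the language-modeling loss. The sketch below shows that combination for one example; the exact formula used by `train.py` is not documented here, so both the function names and the way the terms are summed are assumptions.

```python
import math


def cross_entropy(logits, target: int) -> float:
    """Cross-entropy of one example, computed via a stable log-sum-exp."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[target]


def combined_loss(
    lm_loss: float,
    frustration_pred: float,
    frustration_target: float,
    emotion_logits,          # one logit per emotion class (8 in this model)
    emotion_target: int,
    eq_loss_weight: float = 0.1,
) -> float:
    # Regression head: squared error on the frustration score.
    frustration_loss = (frustration_pred - frustration_target) ** 2
    # Classification head: cross-entropy over the emotion classes.
    emotion_loss = cross_entropy(emotion_logits, emotion_target)
    # Assumed combination: LM loss plus weighted auxiliary losses.
    return lm_loss + eq_loss_weight * (frustration_loss + emotion_loss)


total = combined_loss(2.0, 0.5, 0.5, [0.0] * 8, emotion_target=3)
print(total)
```

With a small weight such as 0.1, the auxiliary tasks shape the shared representation without dominating the language-modeling objective.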
### LoRA/QLoRA
Efficient fine-tuning with low-rank adaptation:
```bash
# LoRA
python train.py --use_lora --lora_r 16 --lora_alpha 32
# QLoRA (4-bit quantization)
python train.py --use_qlora --use_lora --lora_r 8
```
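The arithmetic behind these flags is short enough to work through. LoRA replaces the update to a `d_out x d_in` weight matrix with two low-rank factors of shapes `d_out x r` and `r x d_in`, scaled by `alpha / r`. The sketch below counts trainable parameters for a single square projection, assuming the 4096 hidden size listed in the configuration section; the helper names are illustrative.

```python
def lora_trainable_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters in the two low-rank factors A (r x d_in) and B (d_out x r)."""
    return r * (d_in + d_out)


def lora_scaling(alpha: float, r: int) -> float:
    """Multiplier applied to the low-rank update B @ A."""
    return alpha / r


hidden = 4096  # hidden_size from the configuration section
full = hidden * hidden
lora = lora_trainable_params(hidden, hidden, r=16)

print(full)                  # 16777216
print(lora)                  # 131072
print(lora / full)           # 0.0078125
print(lora_scaling(32, 16))  # 2.0
```

For this one matrix, `--lora_r 16` trains under 1% of the parameters that full fine-tuning would update, which is where most of the memory savings come from.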
## Testing
Run the test suite:
```bash
python test_model.py
```
Tests include:
- Model creation and initialization
- Forward pass and gradient flow
- Text generation
- Multi-task outputs (EQ adapter)
- Loss computation
## Requirements
See `requirements.txt` for full dependencies. Key packages:
- torch>=2.0.0
- transformers>=4.35.0
- datasets>=2.14.0
- accelerate>=0.24.0
- peft>=0.6.0 (for LoRA)
- bitsandbytes>=0.41.0 (for QLoRA)
- tensorboard>=2.14.0
## Performance Tips
1. **Mixed Precision**: Use `--mixed_precision bf16` for faster training (Ampere+ GPUs)
2. **Gradient Checkpointing**: Enabled by default to reduce memory
3. **Batch Size**: Adjust based on VRAM (4-8 for 7B full, 16-32 for LoRA)
4. **Sequence Length**: Longer sequences use more memory; adjust `--max_seq_length`
## Troubleshooting
### Out of Memory
- Reduce batch size
- Use gradient accumulation
- Enable LoRA/QLoRA
- Use mixed precision
- Reduce sequence length
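A rough estimate helps decide which of these remedies to try first. The sketch below computes the memory taken by the weights alone at different precisions, and the effective batch size when gradient accumulation is used. The 7-billion parameter count is a round-number assumption, and the estimate excludes activations, gradients, optimizer state, and the KV cache.

```python
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "int8": 1.0, "4bit": 0.5}


def weight_memory_gb(num_params: int, dtype: str) -> float:
    """Memory for model weights only, in GB (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def effective_batch_size(per_device: int, accumulation_steps: int, devices: int = 1) -> int:
    """Number of samples contributing to each optimizer step."""
    return per_device * accumulation_steps * devices


params = 7_000_000_000  # assumed round figure for a 7B model
print(weight_memory_gb(params, "bf16"))  # 14.0
print(weight_memory_gb(params, "4bit"))  # 3.5
print(effective_batch_size(per_device=2, accumulation_steps=4))  # 8
```

Halving the per-device batch while doubling the accumulation steps keeps the effective batch size unchanged and lowers peak activation memory.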
### Slow Training
- Increase batch size if possible
- Use more gradient accumulation steps
- Ensure data loading is not the bottleneck
- Use mixed precision
### Poor Quality Outputs
- Train longer (more epochs)
- Use higher quality data
- Adjust learning rate (try 1e-5 to 5e-5)
- Enable curriculum learning
- Use quality filtering
## Citation
If you use Zenith-7B in your research, please cite:
```bibtex
@misc{zenith-7b-2025,
title={Zenith-7B: A Hybrid MoE Model for Code and Emotional Intelligence},
year={2025},
publisher={Zenith Project}
}
```
## License
MIT, as declared in the model card metadata at the top of this file.
## Contact
For issues and questions, please open an issue on the project repository.