---
language:
  - en
license: apache-2.0
tags:
  - code-generation
  - reasoning
  - emotional-intelligence
  - moe
  - long-context
  - open-thoughts
  - zenith
  - 7b
  - qwen2.5-coder
datasets:
  - open-thoughts/OpenThoughts3-1.2M
model-index:
  - name: Zenith-7B-V1
    results: []
---

# Zenith-7B-V1

Production-ready 7B-parameter model for code generation, reasoning, and emotional intelligence.

Zenith-7B is a language model fine-tuned from Qwen2.5-Coder-7B and enhanced with the following features:

- 🎯 **Code Generation**: Strong programming ability across multiple languages
- 🧠 **Reasoning**: Strong performance on algorithmic and mathematical problems
- 💖 **Emotional Intelligence**: EQ adapter for recognizing and responding to emotions
- 📚 **OpenThoughts Integration**: Trained on high-quality reasoning data from OpenThoughts3-1.2M
- 🔄 **MoE Support**: Optional Mixture of Experts for sparse activation
- 📏 **Long Context**: 8K context window (extendable to 32K with ring attention)
- ⚡ **Efficient Fine-Tuning**: Full support for LoRA and QLoRA

## Model Details

### Architecture

- **Base Model**: Qwen/Qwen2.5-Coder-7B
- **Parameters**: 7.0B (dense) or configurable with MoE
- **Hidden Size**: 4096
- **Layers**: 32
- **Attention Heads**: 32
- **Vocabulary Size**: 32,000 (configurable)

### Training

- **Dataset**: OpenThoughts3-1.2M with custom filtering and curriculum learning
- **Sequence Length**: Up to 8192 tokens
- **Training Method**: Full fine-tuning or LoRA/QLoRA
- **Mixed Precision**: BF16/FP16 support

## Usage

### Installation

```bash
pip install torch transformers accelerate peft bitsandbytes
```

### Load Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from configs.zenith_config import get_7b_config

# Load the repo-local configuration
config = get_7b_config()

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "your-username/Zenith-7B",
    config=config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("your-username/Zenith-7B")
```

### Inference

```python
# Generate text
prompt = "Write a Python function to reverse a linked list:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### Interactive Chat

```python
from inference import load_model, generate

model, tokenizer = load_model("your-username/Zenith-7B")

while True:
    user_input = input("\nYou: ")
    if user_input.lower() == "quit":
        break
    response = generate(model, tokenizer, user_input, max_new_tokens=512)
    print(f"\nAssistant: {response}")
```

## Training Your Own

See [FINETUNE_GUIDE.md](FINETUNE_GUIDE.md) for comprehensive fine-tuning instructions.

### Quick Start

```bash
# LoRA fine-tuning (recommended)
python finetune_qwen.py \
  --base_model Qwen/Qwen2.5-Coder-7B \
  --train_data ./data/train.json \
  --use_lora \
  --lora_r 16 \
  --epochs 3 \
  --batch_size 8 \
  --learning_rate 1e-4
```

### Full Training

```bash
python train.py \
  --base_model Qwen/Qwen2.5-Coder-7B \
  --train_data ./data/train.json \
  --epochs 3 \
  --batch_size 4 \
  --learning_rate 2e-5 \
  --use_quality_filter \
  --use_curriculum
```

## Features

### 1. OpenThoughts Integration

Seamless integration with the OpenThoughts3-1.2M dataset:

```python
from data.openthoughts_processor import OpenThoughtsProcessor, OpenThoughtsConfig

config = OpenThoughtsConfig(
    dataset_name="open-thoughts/OpenThoughts3-1.2M",
    streaming=True,
    quality_filtering=True,
    curriculum_learning=True,
    tokenizer=tokenizer,
)
processor = OpenThoughtsProcessor(config)
dataset = processor.load_dataset()
```

### 2. Quality Filtering

Multi-dimensional quality assessment:

- Length appropriateness
- Language detection (English)
- Repetition detection
- Coherence scoring
- Structure validation
- Thought quality (for CoT data)
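As an illustration of how such a filter might combine a few of these dimensions (a minimal pure-Python sketch; `score_sample`, its thresholds, and the shingle heuristic are hypothetical, not the actual `data/quality_filter.py` API):

```python
import re

def score_sample(text: str,
                 min_len: int = 50,
                 max_len: int = 8192,
                 max_repetition: float = 0.3) -> float:
    """Toy multi-dimensional quality score in [0, 1]."""
    scores = []

    # Length appropriateness: 1.0 inside the window, 0.0 outside.
    scores.append(1.0 if min_len <= len(text) <= max_len else 0.0)

    # Repetition: fraction of duplicated 4-word shingles.
    words = text.split()
    shingles = [tuple(words[i:i + 4]) for i in range(max(len(words) - 3, 0))]
    if shingles:
        dup_ratio = 1.0 - len(set(shingles)) / len(shingles)
        scores.append(1.0 if dup_ratio <= max_repetition else 0.0)

    # Structure: penalise samples with no sentence punctuation at all.
    scores.append(1.0 if re.search(r"[.!?]", text) else 0.0)

    return sum(scores) / len(scores)

def keep(text: str, threshold: float = 0.75) -> bool:
    """Accept a sample only if its aggregate score clears the threshold."""
    return score_sample(text) >= threshold
```

A real filter would add language detection, coherence, and CoT-specific checks on top of these.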

### 3. Curriculum Learning

Progressive training stages:

1. **Foundation**: High-quality, well-structured samples
2. **Reasoning**: Chain-of-thought and problem-solving
3. **Code**: Programming and technical content
4. **Full**: Complete dataset
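The staged sweep above can be sketched as follows (an illustrative gate per stage; the sample fields `quality`, `domain`, and the `<think>` marker are assumptions, not the actual `data/curriculum_sampler.py` interface):

```python
STAGES = ["foundation", "reasoning", "code", "full"]

def stage_filter(sample: dict, stage: str) -> bool:
    """Decide whether a sample belongs to the current curriculum stage."""
    if stage == "foundation":
        return sample.get("quality", 0.0) >= 0.9      # well-structured, high quality
    if stage == "reasoning":
        return ("<think>" in sample.get("text", "")   # chain-of-thought traces
                or sample.get("domain") == "math")
    if stage == "code":
        return sample.get("domain") == "code"
    return True  # "full": the complete dataset

def curriculum(samples, stages=STAGES):
    """Yield (stage, sample) pairs, sweeping the stages in order."""
    for stage in stages:
        for s in samples:
            if stage_filter(s, stage):
                yield stage, s
```

Each stage thus trains on a progressively broader slice until the final pass covers everything.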

### 4. MoE (Mixture of Experts)

Enable sparse activation for better performance:

```python
config.num_experts = 8
config.moe_top_k = 2
```

- Top-2 routing with load balancing
- 60% of layers use MoE (middle layers)
- Shared router groups for efficiency
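Top-2 routing with a load-balancing term can be sketched in plain Python (illustrative only, not the `models/moe_layer.py` implementation; the Switch-Transformer-style auxiliary loss is one common choice):

```python
import math

def top2_route(logits):
    """Softmax over expert logits, keep the top-2 experts,
    and renormalise their weights so they sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    top2 = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:2]
    norm = probs[top2[0]] + probs[top2[1]]
    return [(i, probs[i] / norm) for i in top2]

def load_balance_loss(all_probs, assignments, num_experts):
    """Auxiliary loss num_experts * sum_e f_e * P_e, where f_e is the
    fraction of tokens routed to expert e and P_e the mean router
    probability for e. It is minimised when routing is uniform."""
    n = len(all_probs)
    f = [0.0] * num_experts
    p = [0.0] * num_experts
    for probs, chosen in zip(all_probs, assignments):
        for e in chosen:
            f[e] += 1.0 / (n * len(chosen))
        for e in range(num_experts):
            p[e] += probs[e] / n
    return num_experts * sum(fe * pe for fe, pe in zip(f, p))
```

Each token's hidden state would then be the weighted sum of the two selected experts' outputs.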

### 5. EQ Adapter

Emotional intelligence module:

```python
config.use_eq_adapter = True
config.eq_loss_weight = 0.1
```

- Frustration detection (regression)
- 8-emotion classification
- Fused with the attention mechanism
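How the two EQ heads might combine into the training objective, with the EQ term scaled by `eq_loss_weight` (a hedged sketch; the function names and the 8-emotion label set are hypothetical):

```python
import math

# Hypothetical 8-way emotion label set; the actual labels may differ.
EMOTIONS = ["joy", "sadness", "anger", "fear",
            "surprise", "disgust", "trust", "anticipation"]

def eq_loss(frustration_pred, frustration_target, emotion_logits, emotion_target):
    """MSE for the frustration regression head plus cross-entropy
    for the 8-way emotion classification head."""
    mse = (frustration_pred - frustration_target) ** 2
    m = max(emotion_logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in emotion_logits))
    ce = log_z - emotion_logits[emotion_target]
    return mse + ce

def total_loss(lm_loss, eq, eq_loss_weight=0.1):
    """The EQ term is scaled by config.eq_loss_weight and added to the LM loss."""
    return lm_loss + eq_loss_weight * eq
```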

### 6. Ring Attention (Experimental)

For longer contexts:

```python
config.use_ring_attention = True
config.max_seq_len = 32768
config.ring_attention_chunk_size = 8192
config.ring_attention_overlap = 2048
```
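The effect of `ring_attention_chunk_size` and `ring_attention_overlap` on how a long sequence is tiled can be illustrated like this (an assumption about the chunking scheme, not the actual implementation):

```python
def attention_chunks(seq_len, chunk_size=8192, overlap=2048):
    """Split a sequence into chunks of `chunk_size` whose starts are
    `chunk_size - overlap` apart, so consecutive chunks share
    `overlap` tokens of context."""
    stride = chunk_size - overlap
    chunks = []
    start = 0
    while start < seq_len:
        chunks.append((start, min(start + chunk_size, seq_len)))
        if start + chunk_size >= seq_len:
            break
        start += stride
    return chunks
```

With the defaults above, a 32K sequence is covered by five overlapping 8K windows.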

## Evaluation

### Code Generation

```bash
python -m evaluation.benchmark \
  --model_path ./outputs/checkpoint-final \
  --benchmarks humaneval mbpp
```

### Reasoning

```bash
python -m evaluation.benchmark \
  --model_path ./outputs/checkpoint-final \
  --benchmarks gsm8k math
```

### Emotional Intelligence

Custom evaluation with emotional benchmarks (see the evaluation module).

## File Structure

```
Zenith/V1/7B/
├── configs/
│   ├── zenith_config.py    # Model configuration
│   ├── data_config.py      # Data processing config
│   └── training_config.py  # Training hyperparameters
├── data/
│   ├── openthoughts_processor.py
│   ├── quality_filter.py
│   ├── curriculum_sampler.py
│   ├── advanced_tokenizer.py
│   └── preprocessing.py
├── models/
│   ├── zenith_model.py
│   ├── dense_layer.py
│   └── moe_layer.py
├── utils/
│   ├── checkpoint.py
│   ├── logging_utils.py
│   └── metrics.py
├── training/
│   ├── trainer.py
│   └── train_7b.py
├── tests/
│   └── evaluation/
├── train.py                # Main training script
├── inference.py            # Inference and generation
├── test_model.py           # Model validation tests
├── finetune_qwen.py        # Qwen fine-tuning script
├── modeling_zenith.py      # Hugging Face integration
├── Modelfile               # Ollama configuration
├── requirements.txt        # Python dependencies
├── README.md               # Full documentation
├── FINETUNE_GUIDE.md       # Detailed fine-tuning guide
└── hf_model_card.md        # This file (Hugging Face model card)
```

## Performance

### Benchmarks (Expected)

| Benchmark | Zenith-7B | Qwen2.5-Coder-7B |
|-----------|-----------|------------------|
| HumanEval | ~75%      | ~72%             |
| MBPP      | ~80%      | ~77%             |
| GSM8K     | ~65%      | ~60%             |
| MATH      | ~45%      | ~42%             |

Actual results may vary based on training data and fine-tuning.

### Training Requirements

- **Full Fine-Tuning**: 16GB+ VRAM
- **LoRA**: 8GB+ VRAM
- **QLoRA**: 4GB+ VRAM
- **Training Time**: ~2-3 days on an A100 for the full dataset
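A QLoRA setup within the 4-bit budget above might look like the following (a sketch using the standard `transformers`/`peft` APIs; the `target_modules` list is an assumption and should be matched to the model's actual projection names):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantization with double quantization (QLoRA defaults).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-7B",
    quantization_config=bnb_config,
    device_map="auto",
)

# Low-rank adapters on the attention projections (names assumed).
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
```

Only the adapter weights are trained; the quantized base stays frozen, which is what keeps the memory footprint low.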

## Limitations

- Trained primarily on English text
- May exhibit biases present in the training data
- Generated code should be reviewed for security issues
- Long context (32K) requires significant memory
- MoE and EQ adapters increase memory usage

## Ethical Considerations

This model is intended for research and development purposes. Users should:

- Review generated code for security vulnerabilities
- Be aware of potential biases in outputs
- Use the emotional intelligence features responsibly
- Comply with all applicable laws and regulations

## Citation

```bibtex
@misc{zenith-7b-2025,
  title={Zenith-7B: A Hybrid MoE Model for Code and Emotional Intelligence},
  author={Zenith Project},
  year={2025},
  publisher={Zenith Project},
  license={Apache-2.0}
}
```

## License

Apache 2.0

## Contact

For issues and questions:

- Open an issue on Hugging Face
- Check the documentation in README.md and FINETUNE_GUIDE.md

## Acknowledgments