---
language:
- en
license: apache-2.0
tags:
- code-generation
- reasoning
- emotional-intelligence
- moe
- long-context
- open-thoughts
- zenith
- 7b
- qwen2.5-coder
datasets:
- open-thoughts/OpenThoughts3-1.2M
model-index:
- name: Zenith-7B-V1
  results: []
---
# Zenith-7B-V1

**Production-ready 7B-parameter model for code generation, reasoning, and emotional intelligence.**
Zenith-7B is a state-of-the-art language model fine-tuned from Qwen2.5-Coder-7B, enhanced with advanced features including:
- **Code Generation**: Exceptional programming abilities across multiple languages
- **Reasoning**: Strong performance on algorithmic and mathematical problems
- **Emotional Intelligence**: EQ adapter for recognizing and responding to emotions
- **OpenThoughts Integration**: Trained on high-quality reasoning data from OpenThoughts3-1.2M
- **MoE Support**: Optional Mixture of Experts for sparse activation
- **Long Context**: 8K context window (extendable to 32K with ring attention)
- **Efficient Fine-Tuning**: Full support for LoRA and QLoRA
## Model Details

### Architecture

- **Base Model**: Qwen/Qwen2.5-Coder-7B
- **Parameters**: ~7.6B (dense) or configurable with MoE
- **Hidden Size**: 3584
- **Layers**: 28
- **Attention Heads**: 28 (4 key-value heads with GQA)
- **Vocabulary Size**: 152,064 (configurable)
### Training

- **Dataset**: OpenThoughts3-1.2M with custom filtering and curriculum learning
- **Sequence Length**: Up to 8,192 tokens
- **Training Method**: Full fine-tuning or LoRA/QLoRA
- **Mixed Precision**: BF16/FP16 support
## Usage

### Installation

```bash
pip install torch transformers accelerate peft bitsandbytes
```
### Load Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from configs.zenith_config import get_7b_config  # from the cloned repository

# Load configuration
config = get_7b_config()

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "your-username/Zenith-7B",
    config=config,
    device_map="auto",
    trust_remote_code=True,  # required for the custom modeling_zenith.py code
)
tokenizer = AutoTokenizer.from_pretrained("your-username/Zenith-7B")
```
### Inference

```python
# Generate text
prompt = "Write a Python function to reverse a linked list:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
### Interactive Chat

```python
from inference import load_model, generate

model, tokenizer = load_model("your-username/Zenith-7B")

while True:
    user_input = input("\nYou: ")
    if user_input.lower() == 'quit':
        break
    response = generate(model, tokenizer, user_input, max_new_tokens=512)
    print(f"\nAssistant: {response}")
```
## Training Your Own

See [FINETUNE_GUIDE.md](FINETUNE_GUIDE.md) for comprehensive fine-tuning instructions.

### Quick Start
```bash
# LoRA fine-tuning (recommended)
python finetune_qwen.py \
    --base_model Qwen/Qwen2.5-Coder-7B \
    --train_data ./data/train.json \
    --use_lora \
    --lora_r 16 \
    --epochs 3 \
    --batch_size 8 \
    --learning_rate 1e-4
```
### Full Training

```bash
python train.py \
    --base_model Qwen/Qwen2.5-Coder-7B \
    --train_data ./data/train.json \
    --epochs 3 \
    --batch_size 4 \
    --learning_rate 2e-5 \
    --use_quality_filter \
    --use_curriculum
```
## Features

### 1. OpenThoughts Integration

Seamless integration with the OpenThoughts3-1.2M dataset:

```python
from data.openthoughts_processor import OpenThoughtsProcessor, OpenThoughtsConfig

config = OpenThoughtsConfig(
    dataset_name="open-thoughts/OpenThoughts3-1.2M",
    streaming=True,
    quality_filtering=True,
    curriculum_learning=True,
    tokenizer=tokenizer,
)
processor = OpenThoughtsProcessor(config)
dataset = processor.load_dataset()
```
### 2. Quality Filtering

Multi-dimensional quality assessment:
- Length appropriateness
- Language detection (English)
- Repetition detection
- Coherence scoring
- Structure validation
- Thought quality (for CoT data)
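Two of the checks above, length appropriateness and repetition detection, can be sketched in a few lines. This is illustrative only: the real checks live in `data/quality_filter.py`, whose API is not shown here, and the thresholds below are hypothetical.

```python
def repetition_ratio(text: str, n: int = 3) -> float:
    """Fraction of word n-grams that are duplicates (higher = more repetitive)."""
    words = text.split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    return 1.0 - len(set(ngrams)) / len(ngrams)


def passes_basic_filters(text: str,
                         min_words: int = 20,
                         max_words: int = 4096,
                         max_repetition: float = 0.3) -> bool:
    """Length appropriateness + repetition detection (assumed thresholds)."""
    n_words = len(text.split())
    if not (min_words <= n_words <= max_words):
        return False
    return repetition_ratio(text) <= max_repetition
```

A sample that endlessly repeats the same phrase scores a repetition ratio near 1.0 and is dropped, while a varied sample of adequate length passes.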
### 3. Curriculum Learning

Progressive training stages:

- **Foundation**: High-quality, well-structured samples
- **Reasoning**: Chain-of-thought and problem-solving
- **Code**: Programming and technical content
- **Full**: Complete dataset
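The staged schedule can be sketched as a simple mapping from training progress to the stages above. The stage boundaries here are assumptions for illustration; the actual logic lives in `data/curriculum_sampler.py` and may differ.

```python
# (progress_upper_bound, stage_name) pairs; boundaries are hypothetical
STAGES = [
    (0.10, "foundation"),  # first 10% of steps: high-quality, well-structured
    (0.35, "reasoning"),   # next 25%: chain-of-thought and problem-solving
    (0.60, "code"),        # next 25%: programming and technical content
    (1.00, "full"),        # remainder: complete dataset
]


def current_stage(progress: float) -> str:
    """Map training progress in [0, 1] to a curriculum stage."""
    for boundary, name in STAGES:
        if progress <= boundary:
            return name
    return "full"
```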
### 4. MoE (Mixture of Experts)

Enable sparse activation for better performance:

```python
config.num_experts = 8
config.moe_top_k = 2
```

- Top-2 routing with load balancing
- 60% of layers use MoE (middle layers)
- Shared router groups for efficiency
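To make top-2 routing concrete, here is a minimal sketch of the weight computation, not the actual `moe_layer.py` code. Real implementations route per token with a learned gate over hidden states and add a load-balancing auxiliary loss; this shows only how the top-k experts and their mixture weights are picked from router logits.

```python
import math


def top_k_routing(logits, k=2):
    """Pick the top-k experts and renormalize their softmax weights to sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]  # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]
    top = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]  # (expert_index, weight) pairs
```

With `num_experts = 8` each token's output becomes a weighted sum of just two expert FFNs, so only a fraction of the parameters are active per token.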
### 5. EQ Adapter

Emotional intelligence module:

```python
config.use_eq_adapter = True
config.eq_loss_weight = 0.1
```

- Frustration detection (regression)
- 8-emotion classification
- Fused with the attention mechanism
### 6. Ring Attention (Experimental)

For longer contexts:

```python
config.use_ring_attention = True
config.max_seq_len = 32768
config.ring_attention_chunk_size = 8192
config.ring_attention_overlap = 2048
```
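As a back-of-the-envelope sketch of what those settings imply, the helper below splits a 32K sequence into overlapping chunks; the actual ring attention kernel is considerably more involved, and this function is hypothetical.

```python
def chunk_spans(seq_len, chunk, overlap):
    """Return (start, end) spans where each chunk overlaps the previous by `overlap` tokens."""
    spans, start = [], 0
    while start < seq_len:
        end = min(start + chunk, seq_len)
        spans.append((start, end))
        if end == seq_len:
            break
        start = end - overlap  # step back so neighboring chunks share context
    return spans
```

With `chunk=8192` and `overlap=2048`, a 32,768-token sequence yields five chunks, each new chunk contributing 6,144 fresh tokens after the first.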
## Evaluation

### Code Generation

```bash
python -m evaluation.benchmark \
    --model_path ./outputs/checkpoint-final \
    --benchmarks humaneval mbpp
```

### Reasoning

```bash
python -m evaluation.benchmark \
    --model_path ./outputs/checkpoint-final \
    --benchmarks gsm8k math
```

### Emotional Intelligence

Custom evaluation with emotional benchmarks (see the evaluation module).
## File Structure

```
Zenith/V1/7B/
├── configs/
│   ├── zenith_config.py        # Model configuration
│   ├── data_config.py          # Data processing config
│   └── training_config.py      # Training hyperparameters
├── data/
│   ├── openthoughts_processor.py
│   ├── quality_filter.py
│   ├── curriculum_sampler.py
│   ├── advanced_tokenizer.py
│   └── preprocessing.py
├── models/
│   ├── zenith_model.py
│   ├── dense_layer.py
│   └── moe_layer.py
├── utils/
│   ├── checkpoint.py
│   ├── logging_utils.py
│   └── metrics.py
├── training/
│   ├── trainer.py
│   └── train_7b.py
├── tests/
│   └── evaluation/
├── train.py                    # Main training script
├── inference.py                # Inference and generation
├── test_model.py               # Model validation tests
├── finetune_qwen.py            # Qwen fine-tuning script
├── modeling_zenith.py          # Hugging Face integration
├── Modelfile                   # Ollama configuration
├── requirements.txt            # Python dependencies
├── README.md                   # Full documentation
├── FINETUNE_GUIDE.md           # Detailed fine-tuning guide
└── hf_model_card.md            # This file (Hugging Face model card)
```
## Performance

### Benchmarks (Expected)

| Benchmark | Zenith-7B | Qwen2.5-Coder-7B |
|---|---|---|
| HumanEval | ~75% | ~72% |
| MBPP | ~80% | ~77% |
| GSM8K | ~65% | ~60% |
| MATH | ~45% | ~42% |

*Actual results may vary based on training data and fine-tuning.*
### Training Requirements

- **Full Fine-Tuning**: 16GB+ VRAM
- **LoRA**: 8GB+ VRAM
- **QLoRA**: 4GB+ VRAM
- **Training Time**: ~2-3 days on an A100 for the full dataset
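The rough weight-memory arithmetic behind those VRAM figures is easy to reproduce. This counts base-model weights only; optimizer states, gradients, activations, and the KV cache add substantially more, so treat it as a lower bound.

```python
PARAMS = 7.0e9  # nominal 7B parameter count


def weight_gb(bytes_per_param):
    """Memory for the base weights alone, in GB, at a given storage precision."""
    return PARAMS * bytes_per_param / 1e9


bf16 = weight_gb(2.0)   # BF16/FP16 full fine-tuning base: 14 GB of weights alone
int8 = weight_gb(1.0)   # 8-bit loading: 7 GB
nf4 = weight_gb(0.5)    # QLoRA 4-bit base: 3.5 GB, leaving headroom on small GPUs
```

This is why QLoRA fits on a 4-8 GB card while full fine-tuning needs far more: the frozen 4-bit base takes ~3.5 GB, and only the small LoRA adapters are trained in higher precision.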
## Limitations
- Trained primarily on English text
- May exhibit biases present in training data
- Code generation should be reviewed for security
- Long context (32K) requires significant memory
- MoE and EQ adapters increase memory usage
## Ethical Considerations

This model is intended for research and development purposes. Users should:
- Review generated code for security vulnerabilities
- Be aware of potential biases in outputs
- Use emotional intelligence features responsibly
- Comply with all applicable laws and regulations
## Citation

```bibtex
@misc{zenith-7b-2025,
  title={Zenith-7B: A Hybrid MoE Model for Code and Emotional Intelligence},
  author={Zenith Project},
  year={2025},
  publisher={Zenith Project},
  license={Apache-2.0}
}
```
## License

Apache 2.0

## Contact

For issues and questions:

- Open an issue on Hugging Face
- Check the documentation in README.md and FINETUNE_GUIDE.md

## Acknowledgments

- **Base model**: Qwen2.5-Coder-7B
- **Dataset**: OpenThoughts3-1.2M
- **Framework**: Hugging Face Transformers