---
license: gpl-3.0
language:
- az
base_model:
- Yusiko/Khazri
tags:
- aze
- mini
- yusiko
---


# 🌪️ Khazri — Azerbaijani Language Model
**A lightweight, efficient, fully custom Azerbaijani language model for text generation, chat applications, education, and research.**
Khazri is trained from scratch on a custom 10M-sample Azerbaijani dataset and optimized to run on consumer GPUs while maintaining competitive performance.

## 🌟 Features
- 🇦🇿 Native Azerbaijani language support
- ⚡ Lightweight architecture (≈36M parameters)
- 🚀 Fast inference via GGUF + llama.cpp
- 📦 Available on Hugging Face
- 🎯 Optimized for chatbots, WebRTC real-time assistants, and low-latency deployment

## 🏗️ Model Architecture
| Version | Parameters | Type | Context Length | Notes |
|--------|------------|------|----------------|-------|
| Khazri-36M | ~36.6M | GPT-2 Small variant | 1024 | Higher quality |

Architecture:
- Transformer decoder-only
- Multi-head self-attention
- Rotary positional embeddings (RoPE)
- GELU activation
- Layer normalization
- Tied embeddings
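For intuition, the architecture list above can be turned into a rough parameter budget. The card does not publish Khazri's hidden size, layer count, or vocabulary size, so the dimensions below are illustrative assumptions chosen to land near the stated ~36.6M parameters:

```python
def count_params(vocab_size, d_model, n_layers, d_ff):
    """Rough parameter count for a decoder-only transformer with tied
    embeddings and RoPE (no learned positional embeddings)."""
    embed = vocab_size * d_model            # shared input/output embedding
    attn = 4 * d_model * d_model            # Q, K, V and output projections
    mlp = 2 * d_model * d_ff                # up- and down-projection
    return embed + n_layers * (attn + mlp)  # biases/LayerNorms omitted

# Assumed (unconfirmed) dimensions:
total = count_params(vocab_size=32000, d_model=512, n_layers=6, d_ff=2048)
print(f"~{total / 1e6:.1f}M parameters")  # prints ~35.3M parameters
```

Biases and LayerNorm weights add roughly another million parameters on top of this estimate, which is why the sketch lands slightly under the reported 36.6M.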

## 📚 Dataset
Khazri is trained on a 10 million-sample Azerbaijani dataset including:
- News, books, conversations, social media, web articles, educational content

Preprocessing:
- Unicode normalization, deduplication, tokenizer preprocessing, length filtering
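A minimal sketch of those preprocessing steps, in the order listed above. The actual Khazri pipeline is not published, so the function name and length thresholds here are hypothetical:

```python
import unicodedata

def preprocess(samples, min_chars=20, max_chars=100_000):
    """Hypothetical cleaning pass: Unicode normalization, length
    filtering, and exact deduplication."""
    seen, kept = set(), []
    for text in samples:
        # NFC keeps Azerbaijani letters (ə, ş, ğ) in one canonical form.
        text = unicodedata.normalize("NFC", text).strip()
        if not (min_chars <= len(text) <= max_chars):
            continue  # length filtering
        if text in seen:
            continue  # exact deduplication
        seen.add(text)
        kept.append(text)
    return kept
```

Real corpus pipelines usually layer near-duplicate detection (e.g. MinHash) on top of exact deduplication like this.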

## 🏋️ Training Details
### Hardware
- NVIDIA RTX 3090 24GB
- PyTorch 2.x + CUDA 12
- bf16 mixed precision

### Hyperparameters
```
epochs = 1
batch_size = 32
gradient_accumulation = 4
learning_rate = 3e-4
warmup_steps = 500
weight_decay = 0.1
sequence_length = 512
optimizer = AdamW
precision = bf16
```
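Two consequences of the numbers above, sketched below. The card does not say what schedule follows warmup, so the constant tail is an assumption:

```python
def lr_at(step, peak_lr=3e-4, warmup_steps=500):
    # Linear warmup to the peak learning rate over warmup_steps.
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr  # assumption: post-warmup decay is not specified

# batch_size=32 with gradient_accumulation=4 gives the optimizer an
# effective batch of 128 sequences, i.e. 128 * 512 = 65,536 tokens
# per weight update at sequence_length=512.
effective_tokens_per_update = 32 * 4 * 512
```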

## 📈 Training Challenges & Solutions
### Bottleneck: Memory Bandwidth
At this scale the model is memory-bandwidth-bound rather than compute-bound: small models saturate VRAM bandwidth, and throughput plateaued at ~4.2 it/s.
Solutions: shrink the model, tune batch size and gradient accumulation, and optimize data loading.
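As a back-of-envelope check, assuming each iteration is one micro-batch of batch_size=32 at sequence_length=512 (the card does not specify what "it/s" counts):

```python
# Token throughput implied by ~4.2 it/s at the hyperparameters above.
its_per_sec = 4.2
tokens_per_iter = 32 * 512  # batch_size * sequence_length
tokens_per_sec = its_per_sec * tokens_per_iter
print(f"{tokens_per_sec:,.0f} tokens/s")  # roughly 68,813 tokens/s
```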

## 💻 Usage
### Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("Yusiko/Khazri")
model = AutoModelForCausalLM.from_pretrained("Yusiko/Khazri")

inputs = tok("Azərbaycan", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=50)
print(tok.decode(out[0], skip_special_tokens=True))
```

## 🌐 Hugging Face
Available at: https://huggingface.co/Yusiko/Khazri

## 📦 License
GPL-3.0

## 🌍 Future Plans
- 1B+ model
- Better tokenizer
- Instruction-tuning
- WebGPU inference
- Community fine-tuning tools

## 🤝 Contact
Created by **Yusiko**  
GitHub: [Yusiko99](https://github.com/Yusiko99)  
Website: https://yusi.xo.je  
Hugging Face: Yusiko