Khazri: Azerbaijani Language Model
A lightweight, efficient, and fully custom Azerbaijani language model designed for text generation, chat applications, education, and research. Khazri is trained from scratch using a custom 10M-sample Azerbaijani dataset and optimized for running on consumer GPUs while maintaining competitive performance.
Features
- Native Azerbaijani language support
- Lightweight architecture (≈ 36M parameters)
- Supports fast inference with GGUF + llama.cpp
- Available on Hugging Face
- Optimized for chatbots, WebRTC real-time assistants, and low-latency deployment
Model Architecture
| Version | Parameters | Type | Context Length | Notes |
|---|---|---|---|---|
| Khazri-36M | ~36.6M | GPT-2 Small variant | 1024 | Higher quality |
Architecture:
- Transformer decoder-only
- Multi-head self-attention
- Rotary positional embeddings (RoPE)
- GELU activation
- Layer normalization
- Tied embeddings
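To make the rotary positional embedding entry concrete, here is a minimal pure-Python sketch of the general technique. It is not Khazri's actual implementation: the interleaved pairing convention, the `base=10000.0` default, and the helper names `rope` and `dot` are assumptions chosen for illustration.

```python
import math

def rope(vec, position, base=10000.0):
    """Rotate consecutive pairs of `vec` by position-dependent angles."""
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("RoPE needs an even dimension")
    out = []
    for i in range(0, d, 2):
        # Pair j = i / 2 uses frequency base^(-2j/d) = base^(-i/d).
        theta = position * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        # Standard 2D rotation of the pair (x, y) by angle theta.
        out.extend([x * c - y * s, x * s + y * c])
    return out

def dot(a, b):
    """Plain dot product, used here as the attention score."""
    return sum(x * y for x, y in zip(a, b))
```

Because each pair is only rotated, the score `dot(rope(q, m), rope(k, n))` depends on the offset `m - n` rather than on the absolute positions, which is the property that makes RoPE a relative position encoding.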
Dataset
Khazri is trained on a 10 million-sample Azerbaijani dataset including:
- News, books, conversations, social media, web articles, educational content
Preprocessing:
- Unicode normalization, deduplication, tokenizer preprocessing, length filtering
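The preprocessing steps listed above could look roughly like the sketch below. The actual pipeline is not published here, so the function name `preprocess`, the choice of NFC, the whitespace collapsing, the exact-match deduplication, and the `min_chars` / `max_chars` thresholds are all assumptions.

```python
import unicodedata

def preprocess(samples, min_chars=20, max_chars=2000):
    """Normalize, length-filter, and exactly deduplicate text samples."""
    seen = set()
    kept = []
    for text in samples:
        # NFC merges a base letter plus combining mark (e.g. "g" + breve)
        # into the single composed code point ("ğ").
        text = unicodedata.normalize("NFC", text)
        # Collapse runs of whitespace into single spaces and trim the ends.
        text = " ".join(text.split())
        # Length filter (thresholds are hypothetical).
        if not (min_chars <= len(text) <= max_chars):
            continue
        # Exact-match deduplication on the normalized text.
        if text in seen:
            continue
        seen.add(text)
        kept.append(text)
    return kept
```

Normalizing before deduplicating matters: two samples that differ only in how a letter such as "ğ" is encoded would otherwise both survive.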
Training Details
Hardware
- NVIDIA RTX 3090 24GB
- PyTorch 2.x + CUDA 12
- bf16 mixed precision
Hyperparameters
epochs = 1
batch_size = 32
gradient_accumulation = 4
learning_rate = 3e-4
warmup_steps = 500
weight_decay = 0.1
sequence_length = 512
optimizer = AdamW
precision = bf16
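The arithmetic implied by these hyperparameters can be checked with a short sketch. The batch, accumulation, sequence-length, learning-rate, and warmup values are copied from the list above; the linear warmup shape and the constant rate afterwards are assumptions, since the scheduler is not stated.

```python
# Values copied from the hyperparameter list above.
batch_size = 32
gradient_accumulation = 4
sequence_length = 512
learning_rate = 3e-4
warmup_steps = 500

# Sequences contributing to one optimizer update: 32 * 4 = 128.
effective_batch = batch_size * gradient_accumulation

# Upper bound on tokens per update, assuming every sequence is full length:
# 128 * 512 = 65536.
tokens_per_update = effective_batch * sequence_length

def lr_at(step):
    """Linear warmup to the peak rate, then constant (schedule shape is assumed)."""
    if step < warmup_steps:
        return learning_rate * (step + 1) / warmup_steps
    return learning_rate
```

Gradient accumulation lets a 24 GB card take an update over 128 sequences while only holding 32 in memory at a time.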
Training Challenges & Solutions
Bottleneck: Memory Bandwidth
Small models saturate VRAM bandwidth, resulting in ~4.2 it/s
Solution: shrink model size, adjust batch/accumulation, optimize data loading
Transformers
from transformers import AutoModelForCausalLM, AutoTokenizer
tok = AutoTokenizer.from_pretrained("Yusiko/Khazri")
model = AutoModelForCausalLM.from_pretrained("Yusiko/Khazri")
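Once the tokenizer and model load as shown above, text generation might be wrapped as in the following sketch. The helper name `generate`, the sampling settings (`top_p`, `temperature`, `max_new_tokens`), and the assumption that the tokenizer output can be passed straight to `model.generate` are illustrative and not taken from the model card.

```python
def generate(prompt, max_new_tokens=64):
    """Sample a continuation of `prompt` from Khazri (settings are assumptions)."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("Yusiko/Khazri")
    model = AutoModelForCausalLM.from_pretrained("Yusiko/Khazri")

    # Tokenize to PyTorch tensors (input_ids and attention_mask).
    inputs = tok(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,   # sample instead of greedy decoding
        top_p=0.9,        # hypothetical nucleus-sampling cutoff
        temperature=0.8,  # hypothetical temperature
    )
    # Decode the single returned sequence back to text.
    return tok.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Azərbaycanın paytaxtı")` would download the weights on first use and return the prompt followed by the sampled continuation. Loading the model once outside the function is preferable for repeated calls.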
Hugging Face
Available at: https://huggingface.co/Yusiko/Khazri
License
GPL 3.0 License
Future Plans
- 1B+ model
- Better tokenizer
- Instruction-tuning
- WebGPU inference
- Community fine-tuning tools
Contact
Created by Yusiko
GitHub: Yusiko99
Website: https://yusi.xo.je
Hugging Face: Yusiko