πŸŒͺ️ Khazri — Azerbaijani Language Model

A lightweight, efficient, fully custom Azerbaijani language model designed for text generation, chat applications, education, and research. Khazri is trained from scratch on a custom 10M-sample Azerbaijani dataset and optimized to run on consumer GPUs while maintaining competitive performance.

🌟 Features

  • πŸ‡¦πŸ‡Ώ Native Azerbaijani language support
  • ⚑ Lightweight architecture (≈36M parameters)
  • πŸš€ Supports fast inference with GGUF + llama.cpp
  • πŸ“¦ Available on Hugging Face
  • 🎯 Optimized for chatbots, WebRTC real-time assistants, and low-latency deployment

πŸ—οΈ Model Architecture

| Version    | Parameters | Type                 | Context Length | Notes          |
|------------|------------|----------------------|----------------|----------------|
| Khazri-36M | ~36.6M     | GPT-2 Small variant  | 1024           | Higher quality |

Architecture:

  • Transformer decoder-only
  • Multi-head self-attention
  • Rotary positional embeddings (RoPE)
  • GELU activation
  • Layer normalization
  • Tied embeddings
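As a rough sanity check, a GPT-2-style decoder's parameter count can be estimated from its depth and width. The sketch below uses illustrative values — `n_layer`, `d_model`, and `vocab_size` are assumptions, not Khazri's published configuration — that happen to land near the ~36M range:

```python
# Back-of-the-envelope parameter count for a GPT-2-style decoder with
# tied embeddings. The arguments below are illustrative assumptions,
# not Khazri's published configuration.
def gpt2_params(n_layer, d_model, vocab_size):
    attn = 4 * d_model * d_model   # Q, K, V, and output projections
    mlp = 8 * d_model * d_model    # two linear layers with 4x expansion
    block = attn + mlp             # per transformer block (biases/LayerNorm omitted)
    embed = vocab_size * d_model   # token embeddings, tied with the output head
    return n_layer * block + embed

print(gpt2_params(8, 512, 20000) / 1e6)  # ≈ 35.4M
```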

πŸ“š Dataset

Khazri is trained on a 10 million-sample Azerbaijani dataset including:

  • News, books, conversations, social media, web articles, educational content

Preprocessing:

  • Unicode normalization, deduplication, tokenizer preprocessing, length filtering
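A minimal sketch of the preprocessing steps listed above (the exact Khazri pipeline is not published; the length thresholds here are assumptions for illustration):

```python
# Illustrative preprocessing pass: Unicode normalization, exact
# deduplication, and length filtering. Thresholds are assumed values,
# not Khazri's actual settings.
import unicodedata

def preprocess(samples, min_chars=20, max_chars=4096):
    seen = set()
    cleaned = []
    for text in samples:
        # Unicode normalization to canonical NFC form
        text = unicodedata.normalize("NFC", text).strip()
        # Length filtering
        if not (min_chars <= len(text) <= max_chars):
            continue
        # Exact deduplication on the normalized text
        if text in seen:
            continue
        seen.add(text)
        cleaned.append(text)
    return cleaned
```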

πŸ‹οΈ Training Details

Hardware

  • NVIDIA RTX 3090 24GB
  • PyTorch 2.x + CUDA 12
  • bf16 mixed precision

Hyperparameters

epochs = 1
batch_size = 32
gradient_accumulation = 4
learning_rate = 3e-4
warmup_steps = 500
weight_decay = 0.1
sequence_length = 512
optimizer = AdamW
precision = bf16
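With gradient accumulation, the effective batch per optimizer step is batch_size * gradient_accumulation sequences. A quick check of the token throughput these settings imply:

```python
# Tokens consumed per optimizer step under the hyperparameters above.
batch_size = 32
gradient_accumulation = 4
sequence_length = 512

effective_batch = batch_size * gradient_accumulation  # sequences per update
tokens_per_step = effective_batch * sequence_length   # tokens per update
print(effective_batch, tokens_per_step)  # 128 65536
```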

πŸ“ˆ Training Challenges & Solutions

Bottleneck: Memory Bandwidth

At this scale the model is small enough that training is limited by VRAM memory bandwidth rather than compute, capping throughput at ~4.2 it/s.
Solution: shrink the model, rebalance batch size against gradient accumulation, and optimize data loading.
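The batch/accumulation adjustment follows the standard gradient-accumulation pattern, sketched below. This is illustrative only, not the actual Khazri training script: a toy linear classifier stands in for the language model, and it runs on CPU (on the RTX 3090 the same loop would move tensors to CUDA under bf16 autocast).

```python
# Gradient-accumulation sketch: 4 micro-batches per optimizer update,
# matching gradient_accumulation = 4 above. Toy model, illustrative only.
import torch
import torch.nn as nn

model = nn.Linear(16, 4)  # stand-in for the LM
opt = torch.optim.AdamW(model.parameters(), lr=3e-4, weight_decay=0.1)
loss_fn = nn.CrossEntropyLoss()
grad_accum = 4

opt.zero_grad()
for step in range(8):
    x = torch.randn(32, 16)                   # micro-batch of 32
    y = torch.randint(0, 4, (32,))
    loss = loss_fn(model(x), y) / grad_accum  # scale so gradients average
    loss.backward()                           # gradients accumulate in .grad
    if (step + 1) % grad_accum == 0:
        opt.step()                            # one update per 4 micro-batches
        opt.zero_grad()
```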

Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("Yusiko/Khazri")
model = AutoModelForCausalLM.from_pretrained("Yusiko/Khazri")

# Generate a short continuation from an Azerbaijani prompt
inputs = tok("AzΙ™rbaycan", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=50)
print(tok.decode(out[0], skip_special_tokens=True))
```

🌐 Hugging Face

Available at: https://huggingface.co/Yusiko/Khazri

πŸ“¦ License

GPL-3.0 License

🌍 Future Plans

  • 1B+ model
  • Better tokenizer
  • Instruction-tuning
  • WebGPU inference
  • Community fine-tuning tools

🀝 Contact

Created by Yusiko
GitHub: Yusiko99
Website: https://yusi.xo.je
Hugging Face: Yusiko
