Kolkha-Mini
Kolkha-Mini is a lightweight language model fine-tuned to specialize in the Georgian language.
It is intended as an early-stage foundation model for Georgian-focused NLP work.
This model prioritizes coherence and language exposure over grammatical perfection and should be treated as a base to build upon, not a production-ready assistant.
Base Model
- Qwen/Qwen3-1.7B
Fine-Tuning Overview
- Method: QLoRA (4-bit)
- Training type: Causal Language Modeling
- Epochs: 2
- Context length: 1024 tokens
- Optimizer: paged AdamW (8-bit)
- Scheduler: cosine
- Precision: FP16 compute, NF4 quantized base during training
The final model provided here is a fully merged FP16 model (no LoRA adapters required).
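The setup above can be sketched with Hugging Face `transformers`. Note this is a hedged reconstruction from the bullet list, not the actual training script: the output directory is a placeholder, and hyperparameters not stated in this card (learning rate, batch size, LoRA rank) are omitted.

```python
import torch
from transformers import BitsAndBytesConfig, TrainingArguments

# 4-bit NF4 quantization of the base model with FP16 compute, as listed above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# Trainer arguments matching the card's bullets: 2 epochs, paged 8-bit AdamW,
# cosine schedule, FP16 precision, gradient checkpointing.
training_args = TrainingArguments(
    output_dir="kolkha-mini-qlora",  # placeholder path
    num_train_epochs=2,
    optim="paged_adamw_8bit",
    lr_scheduler_type="cosine",
    fp16=True,
    gradient_checkpointing=True,
)
```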
Training Details (High-Level)
- Base model loaded in 4-bit NF4 using bitsandbytes
- LoRA applied to all major attention and MLP projection layers:
q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- Dataset manually packed into fixed 1024-token blocks to maximize GPU utilization
- Chat templates applied prior to tokenization
- Gradient checkpointing enabled for stability
Training was intentionally kept simple and stable, favoring correctness over experimental tricks.
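The fixed-block packing step can be illustrated with a minimal sketch: tokenized examples are concatenated into one stream and split into 1024-token blocks, so every batch is full-length. The function name and the remainder-dropping behavior are illustrative assumptions, not taken from the actual training code.

```python
def pack_into_blocks(token_streams, block_size=1024):
    """Concatenate lists of token IDs and split them into fixed-size blocks.

    The trailing remainder that does not fill a whole block is dropped,
    so each block is exactly `block_size` tokens (maximizing GPU utilization).
    """
    flat = [tok for stream in token_streams for tok in stream]
    n_blocks = len(flat) // block_size
    return [flat[i * block_size:(i + 1) * block_size] for i in range(n_blocks)]
```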
Current Capabilities & Limitations
What it does well
- Produces coherent Georgian text
- Understands Georgian sentence structure
- Serves as a solid starting point for further fine-tuning
Known issues
- Grammatically incorrect sentences are common
- Occasional hallucinations
- Sometimes invents non-existent words
- Not instruction-tuned or safety-aligned
These issues are expected given the current dataset size and training duration; performance should improve significantly with a larger, cleaner dataset.
Intended Use
- Georgian language research
- Further fine-tuning
- Dataset experimentation
- Low-resource language modeling
Not recommended for:
- Production deployment
- High-stakes or factual tasks
- Safety-critical applications
Usage
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "GiorgiGE/Kolkha-Mini-Georgian",
    torch_dtype="auto",
)
tokenizer = AutoTokenizer.from_pretrained(
    "GiorgiGE/Kolkha-Mini-Georgian"
)

# Generate a short continuation. The model is not instruction-tuned,
# so prompt it with Georgian text to continue rather than with commands.
inputs = tokenizer("ქართული ენა", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))