# AvarNLP — NLLB-600M Fine-tuned for Avar-Turkish Translation

To our knowledge, the first machine translation model for the Avar language (МагIарул мацI).
## Model Description
This model is a LoRA fine-tune of Meta's NLLB-200-distilled-600M, trained on an Avar-Turkish parallel corpus generated with evolutionary (genetic-algorithm) methods.
- Base model: facebook/nllb-200-distilled-600M
- Fine-tuning method: LoRA (Low-Rank Adaptation) via PEFT
- Training data: Evolved from ~1K seed pairs using genetic algorithms
- Directions: Avar → Turkish, Turkish → Avar
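The corpus-evolution step mentioned above can be sketched roughly as follows. This is a minimal illustrative sketch, not the project's actual pipeline: the seed pairs, substitution lexicon, and fitness function are invented placeholders (synthetic tokens stand in for real Avar and Turkish text), and a real system would score candidates with a trained model rather than a length heuristic.

```python
import random

random.seed(0)

# Placeholder seed pairs (source, target); the real project starts from
# ~1K human-curated Avar-Turkish pairs. Tokens here are synthetic.
SEED_PAIRS = [
    ("s1 s2 s3", "t1 t2 t3"),
    ("s4 s5", "t4 t5"),
]

# Hypothetical aligned substitutions: (src_word, tgt_word) -> (src_word, tgt_word).
LEXICON = {
    ("s3", "t3"): ("s6", "t6"),
    ("s5", "t5"): ("s7", "t7"),
}

def mutate(pair):
    """Produce a child pair by applying one aligned word substitution."""
    src, tgt = pair
    hits = [(old, new) for old, new in LEXICON.items()
            if old[0] in src.split() and old[1] in tgt.split()]
    if not hits:
        return pair
    (old_s, old_t), (new_s, new_t) = random.choice(hits)
    src = " ".join(new_s if w == old_s else w for w in src.split())
    tgt = " ".join(new_t if w == old_t else w for w in tgt.split())
    return (src, tgt)

def fitness(pair):
    """Toy adequacy proxy: prefer length-balanced pairs. A real pipeline
    would use a learned scorer (e.g. round-trip translation agreement)."""
    src, tgt = pair
    return 1.0 / (1.0 + abs(len(src.split()) - len(tgt.split())))

def evolve(population, generations=5, pop_size=8):
    """Mutate random parents, merge with parents, dedupe, keep the fittest."""
    for _ in range(generations):
        children = [mutate(random.choice(population)) for _ in range(pop_size)]
        pool = list(dict.fromkeys(population + children))  # dedupe, keep order
        pool.sort(key=fitness, reverse=True)
        population = pool[:pop_size]
    return population

corpus = evolve(list(SEED_PAIRS))
print(f"{len(corpus)} pairs after evolution")
```

The evolved pairs then serve as additional LoRA training data; in practice a filtering step would discard low-fitness candidates before they reach the fine-tuning set.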
## About the Avar Language
| Property | Value |
| --- | --- |
| Name | Avar (МагIарул мацI) |
| Family | Northeast Caucasian (Nakh-Dagestanian) |
| Speakers | ~800,000 |
| Region | Dagestan (Russia), Turkey, Europe |
| UNESCO Status | Definitely Endangered |
| Digital Resources | Virtually none |
## Status
🚧 Under active development — model not yet trained. Check back soon or follow this repo for updates.
## Links
- Code: github.com/Burtinsaw/AvarNLP
- Dataset: Burtinsaw/avar-turkish-parallel
- Project: MagaruLaw.com
## Citation

```bibtex
@software{avarnlp2026,
  title   = {AvarNLP: Self-Evolving AI for the Endangered Avar Language},
  author  = {Arif Akgun},
  url     = {https://github.com/Burtinsaw/AvarNLP},
  year    = {2026},
  license = {MIT}
}
```