AvarNLP — NLLB-600M Fine-tuned for Avar-Turkish Translation

The world's first machine translation model for the Avar language (МагIарул мацI).

Model Description

This model is a LoRA fine-tuned version of Meta's NLLB-200-distilled-600M, trained on an evolutionarily generated Avar-Turkish parallel corpus.

  • Base model: facebook/nllb-200-distilled-600M
  • Fine-tuning method: LoRA (Low-Rank Adaptation) via PEFT
  • Training data: Evolved from ~1K seed pairs using genetic algorithms (a toy illustration follows below)
  • Directions: Avar → Turkish, Turkish → Avar (see the usage sketch below)
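
Once the adapter is published, it should load on top of the base model with PEFT. The snippet below is a minimal sketch, assuming the adapter's Hub id is Burtinsaw/avarnlp-nllb-600m and that the fine-tune addresses Avar through an NLLB-style language tag; "ava_Cyrl" is a hypothetical placeholder, since Avar is not among NLLB-200's built-in languages, so check the released config for the actual tag.

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import PeftModel

BASE = "facebook/nllb-200-distilled-600M"
ADAPTER = "Burtinsaw/avarnlp-nllb-600m"  # this repo; not yet released

tokenizer = AutoTokenizer.from_pretrained(BASE)
model = PeftModel.from_pretrained(AutoModelForSeq2SeqLM.from_pretrained(BASE), ADAPTER)

# "ava_Cyrl" is a hypothetical Avar tag; NLLB-200 does not ship one.
tokenizer.src_lang = "ava_Cyrl"
inputs = tokenizer("МагIарул мацI", return_tensors="pt")
outputs = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("tur_Latn"),  # Turkish target
    max_new_tokens=64,
)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])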

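The evolutionary data generation itself is not documented in this card. Purely to illustrate the idea, the toy loop below evolves a parallel corpus by mutation and selection only (crossover omitted for brevity); the seed pairs, substitution rule, and fitness function are placeholders, not the project's actual pipeline.

import random

random.seed(0)

# Placeholder seed pairs (Avar, Turkish); the real corpus starts from ~1K.
seed_pairs = [
    ("МагIарул мацI", "Avar dili"),
    ("Дагъистан", "Dağıstan"),
]

# Placeholder aligned substitutions a mutation may apply to both sides;
# a real pipeline would draw on a bilingual lexicon or a model in the loop.
substitutions = [(("мацI", "dili"), ("халкъ", "halkı"))]

def mutate(pair):
    """Apply one aligned word swap to both sides, if it matches."""
    av, tr = pair
    (a_old, t_old), (a_new, t_new) = random.choice(substitutions)
    if a_old in av and t_old in tr:
        return av.replace(a_old, a_new), tr.replace(t_old, t_new)
    return pair

def fitness(pair):
    """Placeholder score: prefer pairs with similar lengths. A real
    pipeline would score adequacy/fluency, e.g. via round-trip
    translation or a quality-estimation model."""
    av, tr = pair
    return min(len(av), len(tr)) / max(len(av), len(tr))

population = list(seed_pairs)
for _ in range(5):  # generations
    children = [mutate(random.choice(population)) for _ in range(4)]
    population = list(dict.fromkeys(population + children))  # deduplicate
    population = sorted(population, key=fitness, reverse=True)[:8]  # selection

for av, tr in population:
    print(f"{av}  ->  {tr}")
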
About the Avar Language

  • Name: Avar (МагIарул мацI)
  • Family: Northeast Caucasian (Nakh-Dagestanian)
  • Speakers: ~800,000
  • Region: Dagestan (Russia), Turkey, Europe
  • UNESCO status: Definitely Endangered
  • Digital resources: Virtually none

Status

🚧 Under active development — model not yet trained. Check back soon or follow this repo for updates.

Links

  • GitHub: https://github.com/Burtinsaw/AvarNLP

Citation

@software{avarnlp2026,
  title     = {AvarNLP: Self-Evolving AI for the Endangered Avar Language},
  author    = {Arif Akgun},
  url       = {https://github.com/Burtinsaw/AvarNLP},
  year      = {2026},
  license   = {MIT}
}