DevLake: MarianMT (77M) for Russian-Bashkir

Current | Model | Architecture | Focus
🔴 | Large Model | NLLB-1.3B (QLoRA) | Best Quality (SOTA)
🟡 | Medium Model | M2M-100 (418M) | Balanced
🟢 | Small (This Model) | MarianMT (Full FT) | Fastest / CPU Friendly

Model Description

This is the lightweight model from Team DevLake. Unlike the larger models, it was adapted by transfer learning from an English-Turkish model (a full fine-tune of all weights, not training from random initialization), with a manually expanded vocabulary to support Bashkir Cyrillic characters.
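
The vocabulary expansion mentioned above could look roughly like the sketch below. It is an illustration only: the model card does not say how the vocabulary was extended, so adding single characters as new tokens is an assumption, and `BASHKIR_EXTRA`, `missing_characters`, and `extend_vocabulary` are hypothetical names. The calls `get_vocab`, `add_tokens`, and `resize_token_embeddings` are standard `transformers` methods.

```python
# Bashkir Cyrillic letters that do not occur in the Russian alphabet.
BASHKIR_EXTRA = "ӘәӨөҮүҒғҠҡҢңҘҙҪҫҺһ"


def missing_characters(vocab, required=BASHKIR_EXTRA):
    """Return the required characters that appear in no vocabulary token."""
    covered = set("".join(vocab))
    return [ch for ch in required if ch not in covered]


def extend_vocabulary(model, tokenizer):
    """Add uncovered Bashkir letters as tokens and resize the embeddings.

    Assumption: `model` and `tokenizer` are Hugging Face objects loaded
    from the parent checkpoint. Returns the number of tokens added.
    """
    new_tokens = missing_characters(tokenizer.get_vocab().keys())
    added = tokenizer.add_tokens(new_tokens)
    if added:
        # New embedding rows are freshly initialized and need fine-tuning.
        model.resize_token_embeddings(len(tokenizer))
    return added
```

Newly added embedding rows carry no learned information, which is one reason a full fine-tune is a natural fit after this kind of change.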

It is designed for environments with limited resources (CPU deployment, mobile, etc.).

  • Score: 43.15 chrF++
  • Parameters: 77M (very fast)
  • Caveat: May struggle with complex semantic nuances compared to NLLB.
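
For readers unfamiliar with the metric, the sketch below shows the idea behind chrF++: an F-score (recall weighted by beta = 2) over character n-grams up to order 6 plus word n-grams up to order 2. It is a simplified, sentence-level illustration and will not reproduce the 43.15 reported above; the card does not say which tool produced that score, and a standard implementation such as sacreBLEU should be used for real evaluation. The function names are hypothetical.

```python
from collections import Counter


def ngram_counts(seq, n):
    """Count the n-grams of a sequence (characters or words)."""
    return Counter(tuple(seq[i:i + n]) for i in range(len(seq) - n + 1))


def simple_chrf_pp(hyp, ref, char_order=6, word_order=2, beta=2.0):
    """Simplified chrF++-style score in the range 0..100."""
    precisions, recalls = [], []
    units = [
        # Character n-grams, with spaces removed.
        (list(hyp.replace(" ", "")), list(ref.replace(" ", "")), char_order),
        # Word n-grams.
        (hyp.split(), ref.split(), word_order),
    ]
    for hyp_seq, ref_seq, max_n in units:
        for n in range(1, max_n + 1):
            h, r = ngram_counts(hyp_seq, n), ngram_counts(ref_seq, n)
            if not h or not r:
                continue  # sequence too short for this order
            overlap = sum((h & r).values())
            precisions.append(overlap / sum(h.values()))
            recalls.append(overlap / sum(r.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return 100 * (1 + beta ** 2) * p * r / (beta ** 2 * p + r)
```

Because the score works on character n-grams, partially correct word forms still earn credit, which suits morphologically rich languages.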

Important Usage Note

You must prepend the special token >>bak<< to the source sentence.
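
A small helper can make the prefix hard to forget. This is a minimal sketch; `with_target_token` is a hypothetical name, not part of the model or of `transformers`.

```python
TARGET_TOKEN = ">>bak<<"


def with_target_token(sentence, token=TARGET_TOKEN):
    """Prepend the target-language token unless it is already present."""
    stripped = sentence.strip()
    if stripped.startswith(token):
        return stripped
    return f"{token} {stripped}"
```

The helper is idempotent, so applying it to already prefixed text does not duplicate the token.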

Usage

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "Voldis/marian-rus-bak"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Note the prefix!
text = ">>bak<< Добрый день, друзья!"
inputs = tokenizer(text, return_tensors="pt")

outputs = model.generate(**inputs, max_length=128)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
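
For more than a handful of sentences, batching with padding is usually more efficient than one call per sentence. The following is a sketch under assumptions: `batched` and `translate_all` are hypothetical helper names, and the batch size of 8 is an arbitrary starting point, not a recommendation from the authors.

```python
def batched(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def translate_all(sentences, model_id="Voldis/marian-rus-bak", batch_size=8):
    """Translate Russian sentences to Bashkir in padded batches on CPU."""
    # Imported lazily so `batched` stays usable without these packages.
    import torch
    from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id).eval()

    results = []
    for batch in batched(sentences, batch_size):
        # Every source sentence needs the >>bak<< prefix.
        prefixed = [f">>bak<< {s}" for s in batch]
        inputs = tokenizer(prefixed, return_tensors="pt", padding=True,
                           truncation=True, max_length=128)
        with torch.no_grad():  # inference only, no gradients needed
            outputs = model.generate(**inputs, max_length=128)
        results.extend(tokenizer.batch_decode(outputs, skip_special_tokens=True))
    return results
```

Calling `translate_all(["Добрый день, друзья!"])` returns a list with one translated string per input sentence.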

Training Details

  • Hardware: Trained on a single NVIDIA RTX 3080.

Citation

@inproceedings{tyurin-2026-devlake,
    title = "{D}ev{L}ake at {L}o{R}es{MT} 2026: The Impact of Pre-training and Model Scale on {R}ussian-{B}ashkir Low-Resource Translation",
    author = "Tyurin, Vyacheslav",
    booktitle = "Proceedings for the Ninth Workshop on Technologies for Machine Translation of Low Resource Languages (LoResMT 2026)",
    month = mar,
    year = "2026",
    address = "Rabat, Morocco",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2026.loresmt-1.18",
    doi = "10.18653/v1/2026.loresmt-1.18",
    pages = "209--212",
}

Model size: 75.8M params (Safetensors, tensor type F32)
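
The parameter count and tensor type give a rough lower bound on memory for CPU or mobile deployment. The sketch below is back-of-the-envelope arithmetic for the weights only; it ignores activations and runtime overhead. The float16 and int8 rows are hypothetical alternatives for comparison, not formats this checkpoint is published in.

```python
def weight_bytes(num_params, bytes_per_param):
    """Bytes needed to store the weights alone."""
    return num_params * bytes_per_param


PARAMS = 75_800_000  # 75.8M params, as listed for the checkpoint

# Storage width in bytes per parameter for each numeric format.
for name, width in [("float32", 4), ("float16", 2), ("int8", 1)]:
    megabytes = weight_bytes(PARAMS, width) / 1_000_000
    print(f"{name}: {megabytes:.1f} MB")
```

At F32 this comes to roughly 303 MB of weights.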