🏔️ DevLake: MarianMT (77M) for Russian-Bashkir

Current Model Architecture Focus

| Tier | Model | Focus |
| --- | --- | --- |
| 🔴 Large | NLLB-1.3B (QLoRA) | Best Quality (SOTA) |
| 🟡 Medium | M2M-100 (418M) | Balanced |
| 🟢 Small (this model) | MarianMT (Full FT) | Fastest / CPU-friendly |

Model Description

This is the lightweight model from Team DevLake. Unlike the larger models, it is not fine-tuned from a multilingual checkpoint: it is built via transfer learning from an English-Turkish MarianMT model, with a manually expanded vocabulary to support Bashkir Cyrillic characters.

It is designed for environments with limited resources (CPU deployment, mobile, etc.).

  • Score: 43.15 CHRF++
  • Parameters: 77M (Very fast)
  • Caveat: May struggle with complex semantic nuances compared to NLLB.

⚠️ Important Usage Note

You must prepend the special language token `>>bak<<` to the source sentence.
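When translating batches, it is easy to forget this token. A minimal helper (hypothetical, not part of this model's code) that guards against a missing prefix could look like:

```python
# Hypothetical convenience helper: ensure every source sentence carries the
# required >>bak<< language token before tokenization.
def with_bak_prefix(sentences):
    """Prepend '>>bak<<' to each sentence unless it is already present."""
    prefix = ">>bak<<"
    return [
        s if s.lstrip().startswith(prefix) else f"{prefix} {s.strip()}"
        for s in sentences
    ]
```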

🚀 Usage

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "Voldis/marian-rus-bak"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Note the required prefix!
text = ">>bak<< Добрый день, друзья!"  # "Good afternoon, friends!"
inputs = tokenizer(text, return_tensors="pt")

outputs = model.generate(**inputs, max_length=128)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```
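Since this model targets CPU deployment, PyTorch dynamic int8 quantization can shrink it further and speed up CPU inference. The sketch below demonstrates the technique on a small stand-in module (the stub class is illustrative, not part of this model); the same `quantize_dynamic` call is assumed to apply to the loaded MarianMT model, whose compute is dominated by `nn.Linear` layers.

```python
import torch
import torch.nn as nn

class TinyFFNStub(nn.Module):
    """Stand-in module with Linear layers, mimicking a transformer FFN block."""
    def __init__(self):
        super().__init__()
        self.ffn = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 32))

    def forward(self, x):
        return self.ffn(x)

model_fp32 = TinyFFNStub()

# Dynamically quantize all nn.Linear layers to int8; activations stay float
# and are quantized on the fly, so no calibration data is needed.
model_int8 = torch.quantization.quantize_dynamic(
    model_fp32, {nn.Linear}, dtype=torch.qint8
)

out = model_int8(torch.randn(1, 32))
print(out.shape)
```

The same call on the translation model would be `torch.quantization.quantize_dynamic(model, {torch.nn.Linear}, dtype=torch.qint8)`; measure chrF++ afterwards, since quantization can cost some quality.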

🛠️ Training Details

  • Hardware: Trained on a single NVIDIA RTX 3080.

📚 Citation

@article{devlake2026loresmt,
  title={DevLake at LoResMT 2026: The Impact of Pre-training and Model Scale on Russian-Bashkir Low-Resource Translation},
  author={DevLake Team},
  year={2026}
}
