# 🏔️ DevLake: MarianMT (77M) for Russian-Bashkir
| Current Model | Architecture | Focus |
|---|---|---|
| 🔴 Large Model | NLLB-1.3B (QLoRA) | Best Quality (SOTA) |
| 🟡 Medium Model | M2M-100 (418M) | Balanced |
| 🟢 Small (This Model) | MarianMT (Full FT) | Fastest / CPU Friendly |
## Model Description
This is the lightweight model from Team DevLake. Unlike the larger models, which fine-tune multilingual checkpoints, this model was built via transfer learning from an English-Turkic MarianMT checkpoint (Helsinki-NLP/opus-mt-en-trk), with a manually expanded vocabulary to support Bashkir Cyrillic characters.
It is designed for environments with limited resources (CPU deployment, mobile, etc.).
- Score: 43.15 chrF++
- Parameters: 77M (Very fast)
- Caveat: May struggle with complex semantic nuances compared to NLLB.
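For context on the score above, chrF++ is a character n-gram F-score (with a word n-gram component). A simplified, pure-Python sketch of the character-only part of the metric is shown below; it is illustrative only, not the official implementation (reported scores are typically computed with sacrebleu), and it omits the word-order ("++") component.

```python
from collections import Counter

def char_ngrams(text, n):
    # Character n-grams with whitespace removed, as chrF does
    s = text.replace(" ", "")
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))

def chrf(hypothesis, reference, max_n=6, beta=2.0):
    """Simplified sentence-level chrF: char n-gram F-beta, averaged over n."""
    scores = []
    for n in range(1, max_n + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # sentence shorter than n characters
        overlap = sum((hyp & ref).values())
        prec = overlap / sum(hyp.values())
        rec = overlap / sum(ref.values())
        if prec + rec == 0:
            scores.append(0.0)
            continue
        scores.append((1 + beta**2) * prec * rec / (beta**2 * prec + rec))
    return 100 * sum(scores) / len(scores) if scores else 0.0

print(chrf("һаумыһығыҙ", "һаумыһығыҙ"))  # identical strings score 100.0
```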
## ⚠️ Important Usage Note
You must prepend the special token `>>bak<<` to the source sentence.
## 🚀 Usage
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "Voldis/marian-rus-bak"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Note the >>bak<< target-language prefix!
text = ">>bak<< Добрый день, друзья!"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=128)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True)[0])
```
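When translating in batches, the `>>bak<<` prefix must be prepended to every source sentence; forgetting it silently degrades output. A minimal helper for this (the function name is hypothetical, not part of the model or library):

```python
def to_bashkir_input(sentences, lang_token=">>bak<<"):
    """Prepend the MarianMT target-language token to each source sentence.

    Hypothetical convenience helper; pass the result straight to the tokenizer.
    """
    return [f"{lang_token} {s.strip()}" for s in sentences]

batch = to_bashkir_input(["Добрый день, друзья!", "Как дела?"])
print(batch)  # → ['>>bak<< Добрый день, друзья!', '>>bak<< Как дела?']
```

The prepared `batch` can then be tokenized with `tokenizer(batch, return_tensors="pt", padding=True)` and passed to `model.generate` as in the snippet above.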
## 🛠️ Training Details
- Hardware: Trained on a single NVIDIA RTX 3080.
## 📚 Citation

```bibtex
@article{devlake2026loresmt,
  title={DevLake at LoResMT 2026: The Impact of Pre-training and Model Scale on Russian-Bashkir Low-Resource Translation},
  author={DevLake Team},
  year={2026}
}
```
## Base Model

Helsinki-NLP/opus-mt-en-trk