🏔️ DevLake: NLLB-200 (1.3B) for Russian-Bashkir Translation

Current Model Architecture Focus

| Tier | Model | Focus |
|------|-------|-------|
| 🔴 Large (this model) | NLLB-200 1.3B (QLoRA) | Best quality (SOTA) |
| 🟡 Medium | M2M-100 (418M) | Balanced |
| 🟢 Small | MarianMT (77M) | Fastest / CPU |

Model Description

This is the High-Performance model submitted by Team DevLake for the LoResMT 2026 Shared Task. It achieved the highest score in our experiments (52.67 CHRF++), significantly outperforming standard baselines.

It is a fine-tuned version of NLLB-200-1.3B-Distilled, trained using QLoRA (4-bit quantization) on a rigorously filtered subset of the Russian-Bashkir parallel corpus.

  • Paper/Code: GitHub Repository
  • Developed by: DevLake Team
  • Language Pair: Russian (rus_Cyrl) → Bashkir (bak_Cyrl)

🏆 Performance

| Model | Size | CHRF++ | Note |
|-------|------|--------|------|
| DevLake NLLB | 1.3B | 52.67 | Best morphology & syntax |
| DevLake M2M | 418M | 48.80 | Good baseline |
| DevLake Marian | 77M | 43.15 | Fast, but prone to hallucination |

🚀 Usage

Due to the use of QLoRA, this model requires peft and bitsandbytes.

```bash
pip install torch transformers peft bitsandbytes accelerate
```
```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, BitsAndBytesConfig
from peft import PeftModel

# 1. Load the base model (NLLB-200 distilled 1.3B) in 4-bit
base_model_id = "facebook/nllb-200-distilled-1.3B"
model = AutoModelForSeq2SeqLM.from_pretrained(
    base_model_id,
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
    device_map="auto",
)

# 2. Load the DevLake LoRA adapters
adapter_model_id = "Voldis/nllb-1.3b-rus-bak"
model = PeftModel.from_pretrained(model, adapter_model_id)
tokenizer = AutoTokenizer.from_pretrained(adapter_model_id, src_lang="rus_Cyrl")

# 3. Inference
text = "Утром я выпил чашку кофе."  # "In the morning I drank a cup of coffee."
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    generated_tokens = model.generate(
        **inputs,
        # Force the decoder to start with the Bashkir language token
        forced_bos_token_id=tokenizer.convert_tokens_to_ids("bak_Cyrl"),
        max_length=128,
    )

print(tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)[0])
```

🛠️ Training Details

  • Filtering: We used a BERT-based semantic-similarity metric to keep only the top 486k sentence pairs (similarity ≥ 0.80).
  • Method: QLoRA (rank = 64, alpha = 64).
  • Hardware: Trained on a single NVIDIA RTX 3080.
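The similarity filtering above can be sketched as follows, assuming sentence embeddings from a BERT-style encoder have already been computed for both sides of each pair (the card does not name the exact embedding model, and `filter_parallel_pairs` is a hypothetical helper, not part of the released code):

```python
import numpy as np

def filter_parallel_pairs(src_emb: np.ndarray,
                          tgt_emb: np.ndarray,
                          threshold: float = 0.80) -> np.ndarray:
    """Return indices of sentence pairs with cosine similarity >= threshold.

    src_emb, tgt_emb: (n, d) arrays of precomputed sentence embeddings
    for the source and target side of each pair (hypothetical inputs).
    """
    src = src_emb / np.linalg.norm(src_emb, axis=1, keepdims=True)
    tgt = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
    sims = np.sum(src * tgt, axis=1)  # row-wise cosine similarity
    return np.nonzero(sims >= threshold)[0]

# Toy usage: 3 pairs with 4-dimensional embeddings
src = np.array([[1.0, 0.0, 0.0, 0.0],
                [0.0, 1.0, 0.0, 0.0],
                [1.0, 1.0, 0.0, 0.0]])
tgt = np.array([[1.0, 0.0, 0.0, 0.0],
                [0.0, 0.0, 1.0, 0.0],
                [1.0, 0.9, 0.0, 0.0]])
print(filter_parallel_pairs(src, tgt))  # keeps only the well-aligned pairs
```

In the full pipeline the same thresholding would be applied across the whole corpus, keeping the 486k pairs at or above 0.80.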

📚 Citation

```bibtex
@article{devlake2026loresmt,
  title={DevLake at LoResMT 2026: The Impact of Pre-training and Model Scale on Russian-Bashkir Low-Resource Translation},
  author={DevLake Team},
  year={2026}
}
```