m2m100_1.2b-eng-lug

This model is part of the AfriScience-MT project, focused on machine translation of scientific texts for African languages.

Model Description

Property	Value
Model Type	Seq2Seq Translation
Translation Direction	English → Luganda
Base Model	facebook/m2m100_1.2B
Domain	Scientific/Academic texts
Training	Full fine-tuning on AfriScience-MT dataset

Evaluation Results

Performance on the AfriScience-MT validation and test sets:

Split	BLEU	chrF	SSA-COMET
Validation	23.59	51.46	65.32
Test	21.27	49.48	64.32

Metrics explanation:

BLEU: Measures n-gram overlap with reference translations (0-100, higher is better)
chrF: Character-level F-score, robust for morphologically rich languages (0-100, higher is better)
SSA-COMET: Neural metric trained for Sub-Saharan African languages (McGill-NLP/ssa-comet-stl), reported as a percentage (0-100, higher is better)
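
To make the chrF description concrete, the following is a minimal, simplified sketch of a sentence-level character n-gram F-score. The names `char_ngrams` and `simple_chrf` are hypothetical, and the defaults (n-gram orders 1 to 6, beta = 2, whitespace removed) are assumptions for illustration. It is not the tooling used to produce the scores above and will not reproduce them; use an established implementation for reporting.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count character n-grams after removing whitespace (a simplification)."""
    chars = "".join(text.split())
    return Counter(chars[i:i + n] for i in range(len(chars) - n + 1))


def simple_chrf(hypothesis: str, reference: str, max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified sentence-level character n-gram F-score on a 0-100 scale."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # skip orders longer than either string
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    # F-beta: beta > 1 weights recall more heavily than precision
    return 100 * (1 + beta**2) * p * r / (beta**2 * p + r)
```

Because the score is built from character n-grams rather than whole words, a hypothesis that differs from the reference only in an affix still earns partial credit, which is why chrF is often preferred for morphologically rich languages.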

Usage

Quick Start

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "dsfsi/m2m100_1.2b-eng-lug"
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Set source language
tokenizer.src_lang = "en"

# Translate
text = "The mitochondrion is the powerhouse of the cell."
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=256)

# Generate with target language
forced_bos_token_id = tokenizer.get_lang_id("lg")
outputs = model.generate(**inputs, forced_bos_token_id=forced_bos_token_id, max_length=256, num_beams=5)
translation = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
print(translation)

Batch Translation

texts = [
    "Climate change affects agricultural productivity.",
    "The study analyzed genetic markers in the population.",
    "Renewable energy sources are essential for sustainable development."
]

inputs = tokenizer(texts, return_tensors="pt", padding=True, truncation=True, max_length=256)
outputs = model.generate(**inputs, forced_bos_token_id=forced_bos_token_id, max_length=256, num_beams=5)
translations = tokenizer.batch_decode(outputs, skip_special_tokens=True)
for src, tgt in zip(texts, translations):
    print(f"{src}\n→ {tgt}\n")
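
For longer lists of sentences, passing everything to `generate` in one call may exhaust memory. A minimal sketch of chunked translation follows; `chunked`, `translate_in_chunks`, `make_translate_batch`, and the default `chunk_size` of 8 are hypothetical names and values chosen for illustration, while the tokenizer and generation arguments mirror the snippet above.

```python
from typing import Callable, Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def translate_in_chunks(
    texts: Sequence[str],
    translate_batch: Callable[[List[str]], List[str]],
    chunk_size: int = 8,  # assumed value; tune to available memory
) -> List[str]:
    """Translate `texts` chunk by chunk, preserving input order."""
    translations: List[str] = []
    for chunk in chunked(texts, chunk_size):
        translations.extend(translate_batch(chunk))
    return translations


def make_translate_batch(model, tokenizer, target_lang: str = "lg"):
    """Wrap the batch snippet above as a callable for translate_in_chunks."""
    def translate_batch(batch: List[str]) -> List[str]:
        inputs = tokenizer(batch, return_tensors="pt", padding=True,
                           truncation=True, max_length=256)
        outputs = model.generate(
            **inputs,
            forced_bos_token_id=tokenizer.get_lang_id(target_lang),
            max_length=256,
            num_beams=5,
        )
        return tokenizer.batch_decode(outputs, skip_special_tokens=True)
    return translate_batch
```

With the model and tokenizer loaded as in Quick Start, this could be called as `translate_in_chunks(texts, make_translate_batch(model, tokenizer))`.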

Training Details

Hyperparameters

Parameter	Value
Epochs	10
Batch Size	1
Learning Rate	2e-05

Training Data

Dataset: AfriScience-MT
Domain: Scientific abstracts and papers
Languages: English and 6 African languages (Amharic, Hausa, Luganda, Northern Sotho, Yoruba, isiZulu)

Reproducibility

To reproduce this model:

# Clone the AfriScience-MT repository
git clone https://github.com/afriscience-mt/afriscience-mt.git
cd afriscience-mt

# Install dependencies
pip install -r requirements.txt

# Run training
python -m afriscience_mt.scripts.run_seq2seq_training \
    --data_dir ./data \
    --source_lang eng \
    --target_lang lug \
    --model_name facebook/m2m100_1.2B \
    --model_type m2m100 \
    --output_dir ./output \
    --num_epochs 10 \
    --batch_size 16 \
    --learning_rate 2e-5

Limitations

Domain Specificity: This model is optimized for scientific/academic texts and may perform poorly on colloquial or informal text.
Language Coverage: Supports only English → Luganda translation.
Input Length: Maximum input length is 256 tokens; longer texts should be split into segments.
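
One way to split longer texts is to pack whole sentences greedily into segments that stay within the token budget. The sketch below is an illustration under stated assumptions: `split_into_segments` is a hypothetical helper, sentence boundaries are detected with a naive punctuation regex, and the default `count_tokens` counts whitespace-separated words as a rough stand-in for the model's tokenizer.

```python
import re
from typing import Callable, List


def split_into_segments(
    text: str,
    max_tokens: int = 256,
    count_tokens: Callable[[str], int] = lambda s: len(s.split()),  # rough proxy
) -> List[str]:
    """Greedily pack sentences into segments of at most `max_tokens` tokens.

    A single sentence that exceeds the budget on its own is kept whole,
    so it may still be truncated by the tokenizer downstream.
    """
    # Naive split: break after ., ! or ? followed by whitespace
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    segments: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and count_tokens(candidate) > max_tokens:
            segments.append(current)  # close the full segment
            current = sentence        # start a new one
        else:
            current = candidate
    if current:
        segments.append(current)
    return segments
```

For a budget that matches the model more closely, the word count can be replaced with the tokenizer's own count, for example `count_tokens=lambda s: len(tokenizer(s)["input_ids"])`. Each segment can then be translated separately and the outputs joined in order.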

Citation

If you use this model, please cite the AfriScience-MT paper (arXiv:2605.29741):

@article{abdulmumin2026afriscience,
  title   = {AfriScience-MT: Towards Decolonizing Science in Africa through Text Translation},
  author  = {Abdulmumin, Idris and Gwadabe, Tajuddeen and Muhammad, Shamsuddeen Hassan and Adelani, David Ifeoluwa and Khalo, Nomonde and Ahmad, Ibrahim Said and Modupe, Abiodun and Mumm, Anina and Biyela, Sibusiso and Rabie, Michelle and Havemann, Johanna and Rei, Marek and Abbott, Jade and Marivate, Vukosi},
  journal = {arXiv preprint arXiv:2605.29741},
  year    = {2026},
  url     = {https://arxiv.org/abs/2605.29741}
}

License

This model is released under the Apache 2.0 License.

Acknowledgments

Built on top of facebook/m2m100_1.2B
Evaluation using SSA-COMET for African language assessment
