Ngiemboon→French Fine-Tuned NLLB-200

Model Overview

This repository hosts a fine-tuned version of facebook/nllb-200-distilled-600M adapted for translation from Ngiemboon (nnh) to French (fra). The model was trained on the mimba/text2text dataset using the Hugging Face Seq2SeqTrainer.

Training Details

  • Base model: facebook/nllb-200-distilled-600M
  • Dataset: mimba/text2text (Ngiemboon→French pairs)
  • Tokenizer: NLLB-200 tokenizer with added special token __ngiemboon__
  • Max length: 128 tokens (95th percentile of source sentences)
  • Batch size: 8
  • Epochs: 3
  • Evaluation metric: SacreBLEU
  • Framework: Hugging Face Transformers
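The hyperparameters listed above map onto a Seq2SeqTrainer configuration roughly as sketched below. Only the batch size, epoch count, and max length come from this card; the output directory and any other arguments are illustrative assumptions.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the training configuration implied by the list above.
# output_dir is an assumption, not stated in the card.
training_args = Seq2SeqTrainingArguments(
    output_dir="nllb200-ngiemboon2fr",
    per_device_train_batch_size=8,   # "Batch size: 8"
    num_train_epochs=3,              # "Epochs: 3"
    predict_with_generate=True,      # so SacreBLEU can score generated text
    generation_max_length=128,       # "Max length: 128 tokens"
)
```

These arguments would then be passed to a Seq2SeqTrainer together with the model, tokenizer, and the tokenized mimba/text2text splits.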

Usage

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "mimba/nllb200-ngiemboon2fr"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Source text is prefixed with the added __ngiemboon__ language token.
text = "__ngiemboon__ Ngiemboon phrase de test"  # replace with your Ngiemboon text
inputs = tokenizer(text, return_tensors="pt")

# Force French (fra_Latn) as the decoding target, as NLLB models expect.
outputs = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("fra_Latn"),
    max_length=128,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Evaluation

  • BLEU score computed with SacreBLEU on the validation set.
  • Qualitative inspection shows fluent French output for typical Ngiemboon sentences.

Limitations

  • Inputs longer than 128 tokens are truncated, so very long sentences lose content.
  • Performance depends on dataset size and domain coverage.
  • Designed primarily for Ngiemboon→French; reverse direction not fine-tuned.
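Given the 128-token truncation limit above, one workaround is to split long inputs on sentence boundaries before translating. The sketch below uses a whitespace token count as a rough stand-in for the real subword count, which the NLLB tokenizer would give via `len(tokenizer(text)["input_ids"])`; the function name and budget handling are illustrative.

```python
MAX_TOKENS = 128  # the model's truncation limit

def split_for_translation(text: str, max_tokens: int = MAX_TOKENS) -> list[str]:
    """Greedily pack sentences into chunks under the token budget.

    Whitespace tokens approximate subword counts here; in practice the
    NLLB tokenizer's own counts should be used.
    """
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    chunks, current, count = [], [], 0
    for sent in sentences:
        n = len(sent.split())
        if current and count + n > max_tokens:
            chunks.append(". ".join(current) + ".")
            current, count = [], 0
        current.append(sent)
        count += n
    if current:
        chunks.append(". ".join(current) + ".")
    return chunks
```

Each chunk can then be translated independently and the outputs concatenated.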

BibTeX entry and citation info

If you use this model, please cite:

@misc{mimba2026ngiemboon,
  author = {Mimba},
  title = {Ngiemboon→French Fine-Tuned NLLB-200},
  year = {2026},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/mimba/nllb200-ngiemboon2fr}}
}