Ngiemboon→French Fine-Tuned NLLB-200
Model Overview
This repository hosts a fine-tuned version of facebook/nllb-200-distilled-600M adapted for translation from Ngiemboon (nnh) to French (fra).
The model was trained on the mimba/text2text dataset using the Hugging Face Seq2SeqTrainer.
Training Details
- Base model: facebook/nllb-200-distilled-600M
- Dataset: mimba/text2text (Ngiemboon→French pairs)
- Tokenizer: NLLB-200 tokenizer with the added special source-language token __ngiemboon__
- Max length: 128 tokens (95th percentile of source sentence lengths)
- Batch size: 8
- Epochs: 3
- Evaluation metric: SacreBLEU
- Framework: Hugging Face Transformers
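The 128-token cap above comes from the 95th percentile of source sentence lengths. As an illustrative sketch (the token lengths below are hypothetical, not from the actual corpus), the cutoff can be derived with a nearest-rank percentile:

```python
import math

# Hypothetical tokenized-length distribution for the source side of the corpus.
token_lengths = sorted([12, 18, 25, 31, 40, 55, 70, 96, 110, 128])

def percentile(sorted_values, p):
    """Nearest-rank percentile over an ascending-sorted list."""
    k = max(0, math.ceil(p / 100 * len(sorted_values)) - 1)
    return sorted_values[k]

# Pick the max_length so that ~95% of source sentences fit without truncation.
max_length = percentile(token_lengths, 95)
```

Choosing the 95th percentile rather than the maximum keeps padding overhead low while truncating only the longest ~5% of sentences.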
Usage
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
model_name = "mimba/nllb200-ngiemboon2fr"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
# Prepend the source-language token added during fine-tuning
text = "__ngiemboon__ Ngiemboon phrase de test"
inputs = tokenizer(text, return_tensors="pt")
# Force French as the target language (NLLB language code fra_Latn)
outputs = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("fra_Latn"),
    max_length=128,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
Evaluation
- BLEU score computed with SacreBLEU on the held-out validation set.
- Qualitative inspection of sample translations shows fluent French output for typical Ngiemboon sentences.
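For intuition about what the reported metric measures, here is a simplified, pure-Python sketch of BLEU (modified n-gram precision with a brevity penalty). It is illustrative only: real SacreBLEU applies its own tokenization and smoothing, so scores will differ.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu(hyp_tokens, ref_tokens, max_n=4):
    """Simplified corpus-free BLEU for one non-empty hypothesis/reference pair."""
    log_prec = []
    for n in range(1, max_n + 1):
        hyp_ng = ngrams(hyp_tokens, n)
        if not hyp_ng:
            break  # hypothesis shorter than n tokens
        ref_ng = ngrams(ref_tokens, n)
        # Clipped n-gram matches: each hypothesis n-gram counts at most
        # as often as it appears in the reference.
        overlap = sum(min(count, ref_ng[g]) for g, count in hyp_ng.items())
        if overlap == 0:
            return 0.0  # no smoothing in this sketch
        log_prec.append(math.log(overlap / sum(hyp_ng.values())))
    # Brevity penalty: punish hypotheses shorter than the reference.
    bp = min(1.0, math.exp(1 - len(ref_tokens) / len(hyp_tokens)))
    return 100 * bp * math.exp(sum(log_prec) / len(log_prec))
```

An exact match scores 100; a hypothesis sharing no bigram with the reference scores 0 under this unsmoothed variant.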
Limitations
- Inputs longer than 128 tokens are truncated.
- Performance depends on dataset size and domain coverage.
- Designed primarily for Ngiemboon→French; reverse direction not fine-tuned.
BibTeX entry and citation info
If you use this model, please cite:
@misc{mimba2026ngiemboon,
  author       = {Mimba},
  title        = {Ngiemboon→French Fine-Tuned NLLB-200},
  year         = {2026},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/mimba/nllb200-ngiemboon2fr}}
}