Wolof ↔ French Translator (NLLB-based) 🇸🇳🇫🇷
This is a Wolof ↔ French translation model based on Meta AI's NLLB (No Language Left Behind) architecture, fine-tuned specifically for high-quality bilingual translation in both directions.
Developed by GalsenAI, a Senegal-based open initiative promoting artificial intelligence for African languages.
About the Model
- Base architecture: `facebook/nllb-200-distilled-600M`
- Supported languages: Wolof (`wo`) ↔ French (`fr`)
- Purpose: To enable reliable translation for real-world applications like education, healthcare, and public services.
- BLEU score: 13 (on a custom Wolof-French evaluation set)
How to Use
Install dependencies
pip install transformers torch
Example usage
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
import torch

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
checkpoint = "galsenai/wolofToFrenchTranslator_nllb"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint).to(device)

def predict(text, lang):
    """Translate `text`; `lang` is the source language code ("wo" or "fr")."""
    if lang.lower() == "wo":
        prefix = "translate Wolof to French: "
    elif lang.lower() == "fr":
        prefix = "translate French to Wolof: "
    else:
        raise ValueError("Invalid language code")
    inputs = tokenizer(prefix + text, return_tensors="pt").to(device)
    # max_length=30 caps the output at 30 tokens; raise it for longer sentences.
    translated_tokens = model.generate(**inputs, max_length=30)
    return tokenizer.batch_decode(translated_tokens, skip_special_tokens=True)[0]

# Example
result = predict("Naka nga def?", lang="wo")
print(result)  # "Comment ça va?"
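For more than a handful of sentences, translating in padded batches is usually faster than calling the model once per sentence. The following is a minimal sketch, assuming `tokenizer` and `model` are loaded as in the example and that the model expects the same task prefixes. The names `PREFIXES`, `build_prompt`, and `translate_batch` are hypothetical helpers, not part of the model's API, and `max_new_tokens=64` is an arbitrary default.

```python
# Task prefixes copied from the usage example; keys are source language codes.
PREFIXES = {
    "wo": "translate Wolof to French: ",
    "fr": "translate French to Wolof: ",
}


def build_prompt(text, lang):
    """Prepend the task prefix for the given source language code."""
    try:
        return PREFIXES[lang.lower()] + text
    except KeyError:
        raise ValueError(f"Invalid language code: {lang!r}") from None


def translate_batch(texts, lang, tokenizer, model, max_new_tokens=64):
    """Translate a list of sentences in a single padded batch (hypothetical helper)."""
    prompts = [build_prompt(t, lang) for t in texts]
    # padding=True pads every prompt to the longest one in the batch.
    inputs = tokenizer(
        prompts, return_tensors="pt", padding=True, truncation=True
    ).to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

A call would look like `translate_batch(["Naka nga def?"], "wo", tokenizer, model)`, which returns a list with one translation per input sentence.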
Performance
| Translation Direction | BLEU Score |
|---|---|
| Wolof → French | 13 |
| French → Wolof | 13 |
Note: BLEU measures n-gram overlap between the model output and reference translations, so it is only a rough indicator of translation quality. Further training is expected to improve results.
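To make the metric concrete, here is a simplified corpus-level BLEU in plain Python. It is a sketch, not the evaluation script behind the scores above: it assumes one reference per sentence, whitespace tokenization, and no smoothing, so its numbers are not directly comparable to the reported 13. The function names `ngrams` and `corpus_bleu` are illustrative.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Simplified corpus BLEU on a 0-100 scale, one reference per hypothesis."""
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_counts, r_counts = ngrams(h, n), ngrams(r, n)
            # Clip each n-gram count by how often it occurs in the reference.
            matches[n - 1] += sum(min(c, r_counts[g]) for g, c in h_counts.items())
            totals[n - 1] += sum(h_counts.values())
    # Without smoothing, any n-gram order with zero matches gives a score of 0.
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    # Geometric mean of the n-gram precisions, computed in log space.
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    # The brevity penalty punishes output shorter than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)
```

For published comparisons, a standard implementation with fixed tokenization is preferable to a hand-rolled one like this.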
Training Data
The model was fine-tuned using a mix of:
- Manually aligned Wolof–French parallel corpora
- Public resources (Common Voice, Wikipedia, administrative documents, etc.)
- Custom datasets collected via LinguaSprint Africa, a crowdsourcing platform for African languages.
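The card does not describe how the parallel data was prepared for fine-tuning. Because the usage example relies on task prefixes, one plausible preprocessing step is to turn each aligned pair into two prefixed examples, one per direction. The sketch below is an assumption for illustration only: `make_examples` and the `source`/`target` keys are hypothetical and may not match the actual training pipeline.

```python
def make_examples(pairs):
    """Turn (wolof, french) pairs into prefixed examples for both directions.

    Hypothetical helper: the prefixes mirror the usage example, but the real
    training format is not documented in this card.
    """
    examples = []
    for wolof, french in pairs:
        # Wolof -> French direction
        examples.append(
            {"source": "translate Wolof to French: " + wolof, "target": french}
        )
        # French -> Wolof direction, built from the same aligned pair
        examples.append(
            {"source": "translate French to Wolof: " + french, "target": wolof}
        )
    return examples
```

Each aligned pair yields two training examples, which is one way a single model can be trained to translate in both directions.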
Contributing
This model is maintained by GalsenAI.
If you'd like to:
- Help improve this model
- Contribute more Wolof/French data
- Build NLP tools for African languages
Join us at github.com/GalsenAI or reach out to the team!
License
MIT License: free to use for research, education, and social applications.
Attribution requested: GalsenAI (2025)