Swahili-English Neural Machine Translation Model (V2)
Model Description
swahili-model-v2 is a Neural Machine Translation (NMT) model designed to translate text from English to Swahili.
This model is a fine-tuned version of the Helsinki-NLP/opus-mt-en-sw Transformer, adapted specifically for high-accuracy translation tasks using a curated parallel corpus.
By leveraging Transfer Learning on a substantial dataset of 281,000 sentence pairs, this model achieves a BLEU score of 41.63 on the validation set.
It demonstrates professional-grade grammatical fluency and robust vocabulary alignment, significantly outperforming the V1 baseline, which was trained from scratch.
Dataset Characteristics
The model was trained on a specific subset of the Swahili-English Parallel Corpus. Prior to training, a comprehensive Exploratory Data Analysis was conducted to ensure data quality and alignment.
Sentence Length Distribution
The dataset follows a long-tailed distribution typical of natural language corpora.
Most sentences are between 5 and 20 words long, a range well suited for training.
Figure 3: English Sentence Lengths
Figure 4: Swahili Sentence Lengths
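As a rough illustration of how these length distributions can be reproduced, the sketch below assumes the parallel corpus has been loaded into a pandas DataFrame; the file name and the "english"/"swahili" column names are hypothetical.

```python
# Minimal sketch of the sentence-length analysis (file and column names are assumptions).
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("swahili_english_parallel.csv")  # hypothetical corpus file

# Word counts per sentence on the source and target side
df["en_len"] = df["english"].str.split().str.len()
df["sw_len"] = df["swahili"].str.split().str.len()

fig, axes = plt.subplots(1, 2, figsize=(10, 4))
axes[0].hist(df["en_len"], bins=50)
axes[0].set_title("English Sentence Lengths")
axes[1].hist(df["sw_len"], bins=50)
axes[1].set_title("Swahili Sentence Lengths")
plt.tight_layout()
plt.show()
```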
Source-Target Alignment
There is a strong linear correlation between English and Swahili sentence lengths.
This indicates a high-quality parallel corpus with few alignment errors.
Figure 5: Length Correlation
The regression line shows a consistent mapping ratio between source and target lengths.
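The correlation and mapping ratio shown in Figure 5 can be quantified along the lines of the sketch below, reusing the DataFrame and length columns from the previous sketch; this is illustrative, not the exact EDA script.

```python
# Quantify source-target length alignment (reuses df, en_len, sw_len from above).
import numpy as np

# Pearson correlation between English and Swahili sentence lengths
r = np.corrcoef(df["en_len"], df["sw_len"])[0, 1]

# Least-squares fit: the slope approximates the mapping ratio of the regression line
slope, intercept = np.polyfit(df["en_len"], df["sw_len"], 1)
print(f"Pearson r = {r:.3f}, length ratio ~ {slope:.2f}")
```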
Training Details
Dataset Configuration
- Source: Swahili-English Parallel Corpus.
- Size: 281,000 sentence pairs.
- Split: 90% Training, 10% Validation.
- Preprocessing: Tokenization using the Helsinki-NLP SentencePiece tokenizer with dynamic padding.
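A minimal preprocessing sketch is shown below. It assumes the corpus has been converted into a Hugging Face Dataset with hypothetical "en" and "sw" text columns, uses the base model's tokenizer, and relies on DataCollatorForSeq2Seq for dynamic padding (the text_target argument requires a recent transformers version); the max_length value is an assumption.

```python
# Sketch of tokenization with dynamic padding (column names and max_length are assumptions).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, DataCollatorForSeq2Seq

tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-sw")
model = AutoModelForSeq2SeqLM.from_pretrained("Helsinki-NLP/opus-mt-en-sw")

def preprocess(batch):
    # Tokenize source (English) and target (Swahili) without padding;
    # padding is applied per batch by the data collator below.
    model_inputs = tokenizer(batch["en"], truncation=True, max_length=128)
    labels = tokenizer(text_target=batch["sw"], truncation=True, max_length=128)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

data_collator = DataCollatorForSeq2Seq(tokenizer, model=model)
# tokenized = dataset.map(preprocess, batched=True, remove_columns=dataset.column_names)
```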
Performance and Evaluation
The model was evaluated using the BLEU (Bilingual Evaluation Understudy) metric, which measures the similarity between the machine-generated translation and professional human reference translations.
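For reference, BLEU can be computed with the sacreBLEU implementation from the evaluate library as sketched below; the exact evaluation script used for this model is not included here, and the example strings are placeholders.

```python
# Hedged sketch of BLEU scoring with sacreBLEU via the evaluate library.
import evaluate

bleu = evaluate.load("sacrebleu")

predictions = ["Ninajifunza kuzungumza Kiswahili leo."]    # model outputs (placeholder)
references = [["Ninajifunza kuzungumza Kiswahili leo."]]   # human references (placeholder)

result = bleu.compute(predictions=predictions, references=references)
print(round(result["score"], 2))
```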
Evaluation Results
- BLEU Score: 41.63
- Validation Loss: 0.8659
These metrics indicate high translation quality, with the model successfully capturing complex sentence structures rather than performing simple word-for-word substitution.
Training Results Table
The following table summarizes the model's performance metrics across all 5 training epochs.
| Epoch | Training Loss | Validation Loss | BLEU Score |
|---|---|---|---|
| 1.0 | 1.0863 | 1.0334 | 28.48 |
| 2.0 | 0.9630 | 0.9337 | 32.06 |
| 3.0 | 0.8826 | 0.8913 | 37.38 |
| 4.0 | 0.8266 | 0.8708 | 40.07 |
| 5.0 | 0.8036 | 0.8659 | 41.63 |
Training Metrics
The training process demonstrated stable convergence with no signs of overfitting. The graphs below illustrate the progression of the BLEU score and Training Loss over 5 epochs.
Figure 1: BLEU Score Progression
The model achieved a rapid increase in translation quality, stabilizing above 40 BLEU by the final epoch.
Figure 2: Loss Convergence
Validation loss consistently decreased, confirming that the model effectively generalized to unseen data.
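A fine-tuning run matching the configuration above could be set up roughly as follows, reusing the model, tokenizer, and data collator from the preprocessing sketch and assuming the tokenized dataset has been split 90/10 into train and validation splits. Only the 5 epochs and the base model come from this card; batch size and learning rate are assumed values, not the exact settings used.

```python
# Sketch of the fine-tuning loop (batch size and learning rate are assumptions).
from transformers import Seq2SeqTrainer, Seq2SeqTrainingArguments

args = Seq2SeqTrainingArguments(
    output_dir="swahili-model-v2",
    num_train_epochs=5,               # matches the 5 epochs in the table above
    per_device_train_batch_size=32,   # assumption
    learning_rate=2e-5,               # assumption
    evaluation_strategy="epoch",
    predict_with_generate=True,       # generate text at evaluation time so BLEU can be computed
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],
    data_collator=data_collator,
    tokenizer=tokenizer,
)
trainer.train()
```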
Intended Uses and Limitations
Intended Uses
This model is suitable for general-purpose translation tasks, including:
- Educational Tools: Assisting learners in understanding English-Swahili sentence structures.
- Content Localization: Translating web content, documentation, or simple narratives into Swahili.
- Communication Aids: Facilitating basic written communication across language barriers.
- NLP Research: Serving as a baseline for low-resource language modeling experiments.
Limitations
- Domain Specificity: The model may struggle with highly technical, medical, or legal jargon that was not present in the training corpus.
- Context Length: As a sentence-level translator, it may lose context when translating very long paragraphs as a single block; a simple sentence-splitting workaround is sketched after this list.
- Dialect Variations: Swahili has multiple dialects; this model aligns primarily with standard Swahili (Kiswahili Sanifu) and may not accurately capture regional slang or informal variations (Sheng).
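One way to work around the context-length limitation is to split long passages into sentences and translate them individually, as in the sketch below; the regex-based splitter is a simplification, and a dedicated sentence segmenter would be more robust.

```python
# Workaround sketch: translate a long paragraph sentence by sentence.
import re
from transformers import pipeline

translator = pipeline("translation", model="codeshujaaa/swahili-model-v2")

paragraph = (
    "Swahili is spoken across East Africa. "
    "It is an official language in several countries. "
    "Many learners study it as a second language."
)

# Naive sentence split on end-of-sentence punctuation (an assumption, not part of the model)
sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", paragraph) if s.strip()]
translated = [out["translation_text"] for out in translator(sentences)]
print(" ".join(translated))
```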
Usage
You can use this model directly with the Hugging Face transformers library.
Python Example
```python
from transformers import pipeline

# Load the translation pipeline
translator = pipeline("translation", model="codeshujaaa/swahili-model-v2")

# Define the input text
text = "I am learning to speak Swahili today."

# Generate and print the translation
translation = translator(text)
print(translation[0]['translation_text'])
```
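If you need control over generation settings, the lower-level equivalent below loads the tokenizer and model directly; the max_length and num_beams values are illustrative assumptions.

```python
# Equivalent lower-level usage with explicit generation parameters (values are assumptions).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("codeshujaaa/swahili-model-v2")
model = AutoModelForSeq2SeqLM.from_pretrained("codeshujaaa/swahili-model-v2")

inputs = tokenizer("I am learning to speak Swahili today.", return_tensors="pt")
outputs = model.generate(**inputs, max_length=128, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```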
Citation
If you use this model in your work, please cite the original architecture authors and this repository:
```bibtex
@misc{mwangi2025swahili,
  author       = {Denis Mwangi},
  title        = {Fine-Tuned Swahili-English Neural Machine Translation Model},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/codeshujaaa/swahili-model-v2}}
}
```