How to use eligapris/kin-eng with Transformers:
```python
# Use a pipeline as a high-level helper
# Warning: Pipeline type "translation" is no longer supported in transformers v5.
# You must load the model directly (see below) or downgrade to v4.x with:
# pip install "transformers<5.0.0"
from transformers import pipeline

pipe = pipeline("translation", model="eligapris/kin-eng")
```

```python
# Load model directly
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("eligapris/kin-eng")
model = AutoModelForSeq2SeqLM.from_pretrained("eligapris/kin-eng")
```
Model Details
Model Description
This is a machine translation model finetuned from NLLB-200's distilled 1.3B model. It is intended for translating education-related data between Kinyarwanda and English.
- Finetuning code repository: the code used to finetune this model can be found here
How to Get Started with the Model
Use the code below to get started with the model.
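A minimal sketch of translating with this model, assuming the finetuned checkpoint keeps the NLLB-200 tokenizer and its FLORES-200 language codes (`kin_Latn` for Kinyarwanda, `eng_Latn` for English). The helper names `load_model` and `translate`, the sample sentence, and the `max_new_tokens` default are illustrative choices, not taken from the model card.

```python
def load_model(model_id="eligapris/kin-eng"):
    """Load the tokenizer and model (transformers is imported lazily)."""
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    return tokenizer, model


def translate(text, tokenizer, model,
              src_lang="kin_Latn", tgt_lang="eng_Latn", max_new_tokens=128):
    """Translate one sentence from src_lang to tgt_lang.

    The language codes are assumptions: they are the NLLB-200 defaults for
    Kinyarwanda and English and may differ for this checkpoint.
    """
    # NLLB tokenizers prepend the source language tag to the input.
    tokenizer.src_lang = src_lang
    inputs = tokenizer(text, return_tensors="pt")
    # Forcing the first generated token selects the target language.
    output_ids = model.generate(
        **inputs,
        forced_bos_token_id=tokenizer.convert_tokens_to_ids(tgt_lang),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0]


# Example usage (downloads the checkpoint on first run):
# tokenizer, model = load_model()
# print(translate("Muraho, amakuru?", tokenizer, model))
```

Swapping `src_lang` and `tgt_lang` gives the English to Kinyarwanda direction under the same assumptions.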
Training Procedure
The model was finetuned on three datasets: a general-purpose dataset, a tourism dataset, and an education dataset.
The model was finetuned in two phases.
Phase one:
- General purpose dataset
- Education dataset
- Tourism dataset
Phase two:
- Education dataset
Apart from the change of datasets between phase one and phase two, no hyperparameters were modified. In both phases, the model was trained on an A100 40GB GPU for two epochs.
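The two-phase schedule described above can be written down as a small data structure. This is only an illustrative sketch: the `Phase` class, the phase names, and the dataset labels are hypothetical placeholders, and the actual finetuning code lives in the repository referenced in the model description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    name: str
    datasets: tuple  # placeholder labels, not real dataset identifiers
    epochs: int = 2  # both phases ran for two epochs


# Phase one mixes all three datasets; phase two uses only the education data.
SCHEDULE = (
    Phase("phase_one", ("general", "education", "tourism")),
    Phase("phase_two", ("education",)),
)


def describe(schedule=SCHEDULE):
    """Return one human-readable line per finetuning phase."""
    return [
        f"{p.name}: {', '.join(p.datasets)} ({p.epochs} epochs)"
        for p in schedule
    ]


for line in describe():
    print(line)
```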
Evaluation
Metrics
Model performance was measured using BLEU, spBLEU, TER, and chrF++ metrics.
Results
| Language direction | BLEU | spBLEU | chrF++ | TER |
|---|---|---|---|---|
| Eng -> Kin | 45.96 | 59.20 | 68.79 | 41.61 |
| Kin -> Eng | 43.98 | 44.94 | 63.05 | 41.41 |
Datasets used to train eligapris/kin-eng
- mbazaNLP/Kinyarwanda_English_parallel_dataset
- mbazaNLP/NMT_Tourism_parallel_data_en_kin