---
license: mit
metrics:
- bleu
base_model:
- facebook/nllb-200-distilled-600M
tags:
- nlp
- low-resource
- efik
- african-language
- translation
---
# Efik ↔ English Translation Model
This model provides **machine translation between English and Efik**. It was fine-tuned on **18k+ parallel sentences** using the **NLLB architecture**, and can be used both for direct translation and for integration into multilingual NLP pipelines (see the pipeline sketch after the list of uses below).
### Uses
- Translate text between English and Efik.
- Assist in educational or localization projects involving Efik.
- Support research in low-resource language NLP.
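For quick experiments, the model can also be wrapped in the Transformers translation `pipeline`. This is a minimal sketch, assuming the checkpoint keeps NLLB's standard language-tag handling and that `ibo_Latn` doubles as the Efik tag (see the usage notes under How to Get Started):

```python
from transformers import pipeline

# Minimal pipeline sketch; assumes NLLB-style src_lang/tgt_lang handling,
# with ibo_Latn standing in for Efik.
translator = pipeline("translation", model="offiongbassey/efik-mt")
result = translator("Good morning, how are you?", src_lang="eng_Latn", tgt_lang="ibo_Latn")
print(result[0]["translation_text"])
```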
### Limitations
- Due to the limited training data, performance may degrade on **long, complex, or domain-specific text**.
### How to Get Started
```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("offiongbassey/efik-mt")
model = AutoModelForSeq2SeqLM.from_pretrained("offiongbassey/efik-mt")

# English → Efik: the source-language tag is prepended to the input text.
text = "My child is very sick and I need to take him to the hospital for treatment."
inputs = tokenizer(f"eng_Latn {text}", return_tensors="pt")
outputs = model.generate(**inputs, max_length=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

# Efik → English: ibo_Latn is reused here as the Efik tag, apparently
# because NLLB-200 has no dedicated code for Efik.
text = "Okon ama adaha utom tọñọ usenubọk."
inputs = tokenizer(f"ibo_Latn {text}", return_tensors="pt")
outputs = model.generate(**inputs, max_length=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
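If the fine-tuned checkpoint preserves NLLB's standard language-tag handling, the source and target languages can also be selected through the tokenizer's `src_lang` attribute and `generate`'s `forced_bos_token_id`. A minimal sketch under that assumption, again treating `ibo_Latn` as the Efik tag:

```python
# Canonical NLLB decoding pattern (sketch). Assumes the checkpoint keeps
# NLLB-200's language-tag conventions, with ibo_Latn standing in for Efik.
tokenizer.src_lang = "eng_Latn"  # source-language tag
inputs = tokenizer("Where is the nearest clinic?", return_tensors="pt")
outputs = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.convert_tokens_to_ids("ibo_Latn"),  # target tag
    max_length=128,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```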
### Training Details
- Architecture: NLLB
- Epochs trained: 8
- Learning Rate: 5e-05
- BLEU scores:
  - EN → EF: 29.58
  - EF → EN: 32.14
- chrF scores:
  - EN → EF: 54.29
  - EF → EN: 48.78
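For reference, BLEU and chrF scores of this kind are typically computed with `sacrebleu`. A minimal sketch with placeholder data (the card's actual evaluation set is not included here):

```python
import sacrebleu

# Placeholder hypotheses/references, not this card's test set.
hypotheses = ["model output sentence"]      # system translations
references = [["gold reference sentence"]]  # one inner list per reference set

print("BLEU:", sacrebleu.corpus_bleu(hypotheses, references).score)
print("chrF:", sacrebleu.corpus_chrf(hypotheses, references).score)
```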