---
tags:
- luganda
- iteso
- runyankore
- acholi
- ateso
- mistral-7b
- fine-tuned
---

# Alkebulan AI: Fine-Tuned Mistral-7B for Ugandan Languages

This is a fine-tuned version of the Mistral-7B model for **Luganda**, **Iteso**, **Runyankore**, **Acholi**, and **Ateso**. The model was trained on **parallel datasets** to enable translation and basic interaction in these languages.

## Languages Supported

- Luganda
- Iteso
- Runyankore
- Acholi
- Ateso

## Training Data

The model was fine-tuned on parallel datasets containing:

- English to Luganda
- English to Iteso
- English to Runyankore
- English to Acholi
- English to Ateso

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Psalms23Wave/Alkebulan-AI")
model = AutoModelForCausalLM.from_pretrained("Psalms23Wave/Alkebulan-AI")

# Prompt the model with a translation instruction.
inputs = tokenizer("Translate to Luganda: Hello, how are you?", return_tensors="pt")

# Without max_new_tokens, generate() may stop after very few tokens.
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
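The `Translate to <Language>: <text>` prompt format used above can be factored into a small helper that rejects unsupported target languages before any generation is attempted. This is a minimal sketch; the helper name and the assumption that the same prompt prefix works for every supported language are illustrative, not part of the released model:

```python
# Languages this model card lists as supported (assumption: all use the
# same "Translate to <Language>: <text>" prompt format).
SUPPORTED_LANGUAGES = ["Luganda", "Iteso", "Runyankore", "Acholi", "Ateso"]

def build_prompt(target_language: str, text: str) -> str:
    """Build a translation prompt in the format shown in the Usage snippet.

    Raises ValueError for languages the model was not fine-tuned on.
    """
    if target_language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"{target_language!r} is not a supported target language")
    return f"Translate to {target_language}: {text}"
```

The resulting string can be passed directly to the `tokenizer(...)` call in the snippet above.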
## Disclaimer

This model is licensed under the CC BY-NC-ND 4.0 license. It is intended for **non-commercial use only**. You may not use this model for commercial purposes, and you may not distribute modified versions of it.