🌍 Multilingual SLM — Ateso · Luganda · English · Runyankore · Japadhola
A lightweight, multilingual Small Language Model (SLM) fine-tuned for question-and-answer tasks across five languages spoken in Uganda and East Africa. Built on top of CohereLabs/tiny-aya-global using LoRA (PEFT), this model is optimized for low-resource, local-language understanding.
Model Details
| Field | Details |
|---|---|
| Base Model | CohereLabs/tiny-aya-global |
| Fine-tuning Method | LoRA (PEFT) |
| Task | Question Answering (QA) |
| Languages | Ateso, Luganda, English, Runyankore, Japadhola |
| Training Samples | 90K custom QA pairs |
| Framework | Transformers + PEFT 0.18.1 |
| License | Apache 2.0 |
Supported Languages
| Language | Code | Region |
|---|---|---|
| English | en | International |
| Luganda | lug | Central Uganda |
| Runyankore | nyn | Western Uganda |
| Ateso | teo | Eastern Uganda / Northern Kenya |
| Japadhola | dho | Eastern Uganda |
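For downstream tooling, the codes in the table above can be kept in a small lookup table. A minimal sketch (the dict and function names are illustrative, not part of the model's API):

```python
# Language codes used in this card (see the table above)
LANGUAGE_CODES = {
    "en": "English",
    "lug": "Luganda",
    "nyn": "Runyankore",
    "teo": "Ateso",
    "dho": "Japadhola",
}

def language_name(code: str) -> str:
    """Resolve a language code to its display name, falling back to the code itself."""
    return LANGUAGE_CODES.get(code, code)

print(language_name("teo"))  # Ateso
```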
Intended Use
✅ Direct Use
This model is designed for question-and-answer inference in multilingual East African contexts. It is suitable for:
- Building local-language chatbots and virtual assistants
- Educational tools for Ugandan language communities
- Research into low-resource NLP for African languages
- Prototyping QA systems before scaling to larger datasets
🔧 Downstream Use
The model can be further fine-tuned or integrated into:
- Mobile or web-based community knowledge bases
- Agricultural, health, or civic information systems in local languages
- Language learning applications
❌ Out-of-Scope Use
- High-stakes or safety-critical applications without additional evaluation
- Languages not covered in training (the model may produce low-quality outputs)
- Tasks beyond question-answering (e.g., code generation, summarization) without further fine-tuning
How to Get Started
Installation
pip install transformers peft torch
Inference
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
base_model_id = "CohereLabs/tiny-aya-global"
adapter_id = "Bateesa/tiny-aya-global-lora-qa"
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(base_model_id)
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()
def ask(question: str) -> str:
    prompt = f"Question: {question}\nAnswer:"
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=128)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
# English
print(ask("What is the capital of Uganda?"))
# Luganda
print(ask("Ekibuga ekikulembera Uganda kye ki?"))
# Runyankore
print(ask("Obwakabaka bw'Uganda nibuki?"))
Training Details
Training Data
- Dataset size: 90K custom QA pairs
- Format: Instruction-style prompt/response pairs (`Question: ... \nAnswer: ...`)
- Languages: Balanced across Ateso, Luganda, English, Runyankore, and Japadhola
- Source: Manually curated domain-specific questions and answers relevant to East African contexts
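Each QA pair is rendered into the instruction-style prompt format described above before tokenization. A minimal sketch of that formatting step (the function name and record fields are illustrative):

```python
def format_qa_pair(question: str, answer: str) -> str:
    """Render one QA pair into the 'Question: ...\\nAnswer: ...' training format."""
    return f"Question: {question}\nAnswer: {answer}"

# Illustrative training record
sample = {
    "question": "What is the capital of Uganda?",
    "answer": "Kampala",
}
text = format_qa_pair(sample["question"], sample["answer"])
print(text)
# Question: What is the capital of Uganda?
# Answer: Kampala
```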
Training Procedure
Fine-tuned using LoRA (Low-Rank Adaptation) via the HuggingFace PEFT library on top of CohereLabs/tiny-aya-global.
Training Hyperparameters
| Parameter | Value |
|---|---|
| Method | LoRA |
| PEFT Version | 0.18.1 |
| Training regime | fp16 mixed precision |
| LoRA rank (r) | 8 |
| LoRA alpha | 16 |
| LoRA dropout | 0.05 |
| Target modules | q_proj, v_proj |
| Epochs | 3 |
| Batch size | 4 |
| Learning rate | 2e-4 |
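The hyperparameters above map directly onto a PEFT `LoraConfig`. A configuration sketch (trainer and data-loading wiring are omitted; `base_model` is assumed to be the loaded `tiny-aya-global` model):

```python
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=8,                                  # LoRA rank
    lora_alpha=16,                        # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention query/value projections
    task_type="CAUSAL_LM",
)
# model = get_peft_model(base_model, lora_config)  # wrap the base model for training
```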
Evaluation
Testing Data
Held-out subset from the 90K custom QA samples, with manual review of responses across all five languages.
Metrics
- Qualitative review: Human evaluation of answer relevance and fluency per language
- BLEU / ROUGE: Planned for future evaluation with expanded dataset
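Until BLEU/ROUGE are run formally, a simple token-overlap score can serve as a rough proxy during manual review. A minimal ROUGE-1-style recall sketch in plain Python (not a substitute for the standard implementations):

```python
def rouge1_recall(reference: str, candidate: str) -> float:
    """Fraction of reference tokens that also appear in the candidate answer."""
    ref_tokens = reference.lower().split()
    cand_tokens = set(candidate.lower().split())
    if not ref_tokens:
        return 0.0
    hits = sum(1 for tok in ref_tokens if tok in cand_tokens)
    return hits / len(ref_tokens)

print(rouge1_recall("the capital of uganda is kampala",
                    "kampala is the capital of uganda"))  # 1.0
```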
Results
⚠️ This model was trained on a modest dataset of 90K custom QA samples. Performance may vary across languages and domains, and it is best used as a baseline or proof of concept. Expanding the training dataset is strongly recommended for production use.
Bias, Risks, and Limitations
- Limited dataset (90K samples): The model may hallucinate or give incorrect answers, particularly for rare or complex questions.
- Language imbalance: If training samples were not evenly distributed, some languages may perform better than others.
- Cultural context: The model may not capture nuanced cultural meanings or idiomatic expressions in all five languages.
- No safety fine-tuning: This model has not been RLHF-tuned or filtered for harmful outputs.
Recommendations
Users should validate model outputs before deploying them in community-facing applications. Additional data collection and evaluation are recommended, especially for Ateso and Japadhola, which have fewer NLP resources available.
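A lightweight pre-deployment check in the spirit of this recommendation might flag suspicious outputs for human review before they reach users. A sketch with illustrative thresholds (the function name and limits are assumptions, not part of this model):

```python
def needs_review(answer: str, min_chars: int = 3, max_chars: int = 500) -> bool:
    """Flag answers that are empty, too short, or suspiciously long for human review."""
    text = answer.strip()
    return len(text) < min_chars or len(text) > max_chars

print(needs_review(""))         # True  (empty answer)
print(needs_review("Kampala"))  # False (plausible short answer)
```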
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact Calculator.
| Field | Details |
|---|---|
| Hardware Type | GPU (e.g., T4 / A100) |
| Training Duration | ~1–2 hours (estimated for 90K samples with LoRA) |
| Cloud Provider | TBD |
| Carbon Emitted | Low (small dataset + LoRA adapter only) |
Citation
If you use this model in your research or application, please cite:
@misc{multilingual-slm-ug,
title = {Multilingual SLM for Ugandan Languages: Ateso, Luganda, English, Runyankore, Japadhola},
author = {PhosAI},
year = {2025},
publisher = {HuggingFace},
url = {https://huggingface.co/Bateesa/tiny-aya-global-lora-qa}
}
Model Card Contact
For questions, feedback, or collaboration inquiries, please open an issue on the model repository or contact [your contact info].
Framework Versions
- PEFT 0.18.1
- Transformers ≥ 4.38.0
- PyTorch ≥ 2.0