---
base_model: unsloth/llama-3.2-3b-instruct
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
license: apache-2.0
language:
- en
- sw
datasets:
- saillab/alpaca_swahili_taco
metrics:
- bleu
- accuracy
- cer
- rouge
pipeline_tag: text-generation
---
# 🧠 SALAMA LLM: Swahili Instruction-Tuned Text Generation Model
**👨‍💻 Developer:** AI4NNOV
**✍️ Authors:** AI4NNOV
**📦 Version:** v1.0
**📜 License:** Apache 2.0
**🛠️ Model Type:** Instruction-Tuned Large Language Model
**🧩 Base Model:** `Jacaranda/UlizaLlama`
---
## 🌍 Overview
**SALAMA LLM** is the **language understanding and generation engine** of the **SALAMA Framework**, a modular Speech-to-Speech (STS) AI pipeline built for African languages.
The model is fine-tuned on Swahili instruction datasets to enable natural, culturally relevant responses in text generation, summarization, question answering, and translation.
This model represents a major step in bridging the linguistic digital divide by providing **high-quality Swahili AI text generation** capabilities within an open, scalable framework.
---
## 🧱 Model Architecture
SALAMA LLM is based on **Jacaranda/UlizaLlama**, fine-tuned using **Parameter-Efficient Fine-Tuning (PEFT)** via **LoRA/QLoRA**.
The architecture supports mixed Swahili-English text inputs while focusing on fluent Swahili text generation for both casual and formal domains.
| Parameter | Value |
|------------|--------|
| **Base Model** | `Jacaranda/UlizaLlama` |
| **Fine-Tuning** | QLoRA / LoRA (PEFT) |
| **Precision** | 4-bit quantization |
| **Optimizer** | AdamW |
| **Learning Rate** | 2e-5 |
| **Epochs** | 3–5 |
| **Frameworks** | Transformers, TRL, PEFT, Unsloth |
| **Languages** | Swahili (sw), English (en) |
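
The exact LoRA hyperparameters are not published in this card, so the rank, alpha, and target modules below are illustrative assumptions. A minimal sketch of the 4-bit QLoRA setup described above, using `transformers`, `peft`, and `bitsandbytes`:

```python
# Minimal QLoRA setup sketch; LoRA rank/alpha/target modules are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

base = "Jacaranda/UlizaLlama"

# 4-bit quantization, matching the precision listed in the table
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base, quantization_config=bnb_config, device_map="auto"
)

# Attach LoRA adapters; only these low-rank matrices are trained
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of weights are trainable
```

Training itself would then typically run through TRL's `SFTTrainer`, with the AdamW optimizer and 2e-5 learning rate from the table.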
---
## 📚 Datasets
| Dataset | Description | Purpose |
|----------|--------------|----------|
| `saillab/alpaca_swahili_taco` | Swahili Alpaca-style instruction-response dataset | Instruction tuning |
| `Jacaranda/kiswallama-pretrained` | 321M Swahili tokens, custom tokenizer (20K vocab) | Base Swahili adaptation |
| Custom Swahili QA corpus | Curated Q&A and summarization samples | Conversational fine-tuning |
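
For reference, the public instruction-tuning set is loadable with the `datasets` library. A quick sketch; the split and column names are assumptions about the Alpaca-style schema, so inspect them before relying on them:

```python
# Load the Swahili Alpaca-style instruction data; the "train" split and
# column names are assumed, so print the dataset to confirm the actual schema.
from datasets import load_dataset

ds = load_dataset("saillab/alpaca_swahili_taco", split="train")
print(ds)      # shows features and row count
print(ds[0])   # first instruction-response example
```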
---
## 🧠 Model Capabilities
✅ Text generation in **Swahili and English**
✅ Instruction-following, summarization, and dialogue
✅ Question answering and translation (EN ↔ SW)
✅ Sentiment and named-entity recognition
✅ Contextually and culturally aligned text generation
---
## 📊 Evaluation Metrics
| Metric | Score | Description |
|---------|-------|-------------|
| **BLEU** | 0.49 | Measures fluency and translation accuracy |
| **ROUGE-L** | 0.61 | Summarization recall and overlap |
| **Accuracy (QA)** | 95.5% | Accuracy on Swahili QA tasks |
| **CER** | 0.28 | Character Error Rate |
| **F1 (avg)** | 0.90+ | Weighted average across tasks |
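
The evaluation sets behind these scores are not published in this card. For producing comparable numbers on your own data, text metrics such as BLEU and ROUGE-L can be computed with Hugging Face's `evaluate` package, sketched here with toy placeholder sentences:

```python
# Toy BLEU/ROUGE-L scoring sketch; predictions and references are placeholders.
import evaluate

bleu = evaluate.load("bleu")
rouge = evaluate.load("rouge")

predictions = ["Elimu ni msingi wa maendeleo."]
references = [["Elimu ndiyo msingi wa maendeleo na ustawi."]]

print(bleu.compute(predictions=predictions, references=references)["bleu"])
print(rouge.compute(predictions=predictions, references=references)["rougeL"])
```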
---
## ⚙️ Usage (Python Example)
Below is a quick example to load and use **SALAMA LLM** for Swahili text generation:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
# Load model and tokenizer
model_name = "EYEDOL/salama-llm" # Change to your Hugging Face repo name
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Swahili prompt: "Write a short sentence about the importance of education."
prompt = "Andika sentensi fupi kuhusu umuhimu wa elimu."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=120,
    do_sample=True,           # sampling must be enabled for temperature/top_p to apply
    temperature=0.7,
    top_p=0.9,
    repetition_penalty=1.05
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
**🦩 Example Output:**
> "Elimu ni msingi wa maendeleo, humwezesha mtu kuelewa dunia na kuboresha maisha yake na jamii kwa ujumla."
> *(Education is the foundation of development; it enables a person to understand the world and improve their life and their community as a whole.)*
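
Since the model is instruction-tuned, prompting through a chat template can produce better-structured answers. A sketch reusing the `tokenizer` and `model` loaded above, assuming the repo ships a chat template (check `tokenizer.chat_template` first):

```python
# Chat-style prompting sketch; assumes tokenizer.chat_template is defined.
messages = [
    # "Briefly explain the benefits of modern farming."
    {"role": "user", "content": "Nieleze kwa ufupi faida za kilimo cha kisasa."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=120, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, not the prompt
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```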
---
## ⚡ Key Features
- 🧩 Optimized for African low-resource NLP contexts
- 💬 Instruction-following in Swahili and English
- ⚙️ Lightweight and efficient (QLoRA fine-tuned; runs on a single 24 GB GPU)
- 🌍 Culturally aligned text generation
- 🦶 Open-source and extendable to other African languages
---
## 🚫 Limitations
- ⚠️ May underperform with heavy code-switching (Swahili-English mix)
- 👀 Not yet optimized for rare dialects or poetic forms
- 📚 Limited exposure to specialized (medical/legal) corpora
- 🔊 Relies on accurate STT transcription in end-to-end speech-to-speech use
---
## 🔗 Related Models
| Model | Description |
|--------|-------------|
| [`EYEDOL/salama-stt`](https://huggingface.co/EYEDOL/salama-stt) | Swahili Speech-to-Text model (Whisper-small fine-tuned) |
| [`EYEDOL/salama-tts`](https://huggingface.co/EYEDOL/salama-tts) | Swahili Text-to-Speech model (VITS architecture) |
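
Within the SALAMA framework, the three models chain into a speech-to-speech loop (STT → LLM → TTS). A rough sketch, assuming both sibling repos load through standard `transformers` pipelines; the task support and audio file name are illustrative, not verified:

```python
# Hypothetical end-to-end speech-to-speech chain; pipeline support is assumed.
from transformers import pipeline

stt = pipeline("automatic-speech-recognition", model="EYEDOL/salama-stt")
llm = pipeline("text-generation", model="EYEDOL/salama-llm")
tts = pipeline("text-to-speech", model="EYEDOL/salama-tts")

text_in = stt("swali.wav")["text"]                             # Swahili speech -> text
reply = llm(text_in, max_new_tokens=120)[0]["generated_text"]  # text -> response
speech = tts(reply)                                            # -> {"audio", "sampling_rate"}
```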
---
## 🧾 Citation
If you use **SALAMA LLM**, please cite:
```bibtex
@misc{salama_llm_2025,
  title={SALAMA LLM: Swahili Instruction-Tuned Text Generation Model},
  author={AI4NNOV},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/EYEDOL/salama-llm}}
}
```
---
**💡 "Elimu ni msingi wa maendeleo – Knowledge is the foundation of progress."**