---
base_model: unsloth/llama-3.2-3b-instruct
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
license: apache-2.0
language:
- en
- sw
datasets:
- saillab/alpaca_swahili_taco
metrics:
- bleu
- accuracy
- cer
- rouge
pipeline_tag: text-generation
---
# SALAMA LLM: Swahili Instruction-Tuned Text Generation Model
**Developer:** AI4NNOV
**Authors:** AI4NNOV
**Version:** v1.0
**License:** Apache 2.0
**Model Type:** Instruction-Tuned Large Language Model
**Base Model:** `Jacaranda/UlizaLlama`
---
## Overview
**SALAMA LLM** is the **language understanding and generation engine** of the **SALAMA Framework**, a modular Speech-to-Speech (STS) AI pipeline built for African languages.
The model is fine-tuned on Swahili instruction datasets to enable natural, culturally relevant responses in text generation, summarization, question answering, and translation.
This model represents a major step in bridging the linguistic digital divide by providing **high-quality Swahili AI text generation** capabilities within an open, scalable framework.
---
## Model Architecture
SALAMA LLM is based on **Jacaranda/UlizaLlama**, fine-tuned using **Parameter-Efficient Fine-Tuning (PEFT)** via **LoRA/QLoRA**.
The architecture supports mixed Swahili-English text inputs while focusing on fluent Swahili text generation for both casual and formal domains.
| Parameter | Value |
|------------|--------|
| **Base Model** | `Jacaranda/UlizaLlama` |
| **Fine-Tuning** | QLoRA / LoRA (PEFT) |
| **Precision** | 4-bit quantization |
| **Optimizer** | AdamW |
| **Learning Rate** | 2e-5 |
| **Epochs** | 3β5 |
| **Frameworks** | Transformers, TRL, PEFT, Unsloth |
| **Languages** | Swahili (sw), English (en) |
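
For reference, the sketch below shows how a QLoRA setup with these hyperparameters could be assembled using Transformers, PEFT, and bitsandbytes. The LoRA rank, alpha, dropout, and target modules are illustrative assumptions; they are not specified in this card.
```python
# Minimal QLoRA fine-tuning sketch using the hyperparameters listed above.
# LoRA rank/alpha/dropout and target_modules are assumptions, not values from this card.
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

base_model = "Jacaranda/UlizaLlama"

# 4-bit quantization (QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model, quantization_config=bnb_config, device_map="auto"
)

# Attach LoRA adapters (attention projections are a common, but assumed, choice)
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

train_data = load_dataset("saillab/alpaca_swahili_taco", split="train")
# ... train with TRL's SFTTrainer (or Unsloth) at lr=2e-5 for 3-5 epochs, per the table above.
```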
---
## Datasets
| Dataset | Description | Purpose |
|----------|--------------|----------|
| `saillab/alpaca_swahili_taco` | Swahili Alpaca-style instruction-response dataset | Instruction tuning |
| `Jacaranda/kiswallama-pretrained` | 321M Swahili tokens, custom tokenizer (20K vocab) | Base Swahili adaptation |
| Custom Swahili QA corpus | Curated Q&A and summarization samples | Conversational fine-tuning |
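
As an illustration, the snippet below loads the instruction dataset and formats one record into a training prompt. The column names (`instruction`, `input`, `output`) follow the usual Alpaca convention and the Swahili section headers are placeholders; check the dataset card for the actual schema.
```python
# Sketch: turn an Alpaca-style record into a single training prompt.
# Column names and the Swahili prompt headers are assumptions.
from datasets import load_dataset

ds = load_dataset("saillab/alpaca_swahili_taco", split="train")

def to_prompt(example):
    prompt = f"### Maelekezo:\n{example['instruction']}\n"   # instruction
    if example.get("input"):
        prompt += f"### Muktadha:\n{example['input']}\n"     # optional context
    prompt += f"### Jibu:\n{example['output']}"              # reference answer
    return {"text": prompt}

ds = ds.map(to_prompt)
print(ds[0]["text"])
```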
---
## Model Capabilities
- Text generation in **Swahili and English**
- Instruction-following, summarization, and dialogue
- Question answering and translation (EN ↔ SW)
- Sentiment and named-entity recognition
- Contextually and culturally aligned text generation
---
## Evaluation Metrics
| Metric | Score | Description |
|---------|-------|-------------|
| **BLEU** | 0.49 | Measures fluency and translation accuracy |
| **ROUGE-L** | 0.61 | Summarization recall and overlap |
| **Accuracy (QA)** | 95.5% | Accuracy on Swahili QA tasks |
| **CER** | 0.28 | Character Error Rate |
| **F1 (avg)** | 0.90+ | Weighted average across tasks |
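
Scores like these can be computed with the Hugging Face `evaluate` library; the snippet below shows the general recipe with placeholder predictions and references, not the actual test set.
```python
# Sketch: scoring generations with BLEU, ROUGE-L, and CER via `evaluate`.
# The prediction/reference pairs here are placeholders, not the real test set.
import evaluate

bleu = evaluate.load("bleu")
rouge = evaluate.load("rouge")
cer = evaluate.load("cer")

predictions = ["Elimu ni msingi wa maendeleo ya jamii."]
references = ["Elimu ndiyo msingi wa maendeleo ya jamii."]

print("BLEU:   ", bleu.compute(predictions=predictions, references=[[r] for r in references])["bleu"])
print("ROUGE-L:", rouge.compute(predictions=predictions, references=references)["rougeL"])
print("CER:    ", cer.compute(predictions=predictions, references=references))
```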
---
## Usage (Python Example)
Below is a quick example to load and use **SALAMA LLM** for Swahili text generation:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
# Load model and tokenizer
model_name = "EYEDOL/salama-llm" # Change to your Hugging Face repo name
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
model_name,
torch_dtype=torch.bfloat16,
device_map="auto"
)
# Swahili text prompt
prompt = "Andika sentensi fupi kuhusu umuhimu wa elimu."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
**inputs,
max_new_tokens=120,
temperature=0.7,
top_p=0.9,
repetition_penalty=1.05
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
**Example Output:**
> "Elimu ni msingi wa maendeleo, humwezesha mtu kuelewa dunia na kuboresha maisha yake na jamii kwa ujumla."
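
The same generation can also be run through the high-level `pipeline` API; a short sketch (the repo name follows the example above and the Swahili prompt is arbitrary):
```python
# Equivalent generation via the high-level pipeline API.
from transformers import pipeline

generator = pipeline("text-generation", model="EYEDOL/salama-llm", device_map="auto")
result = generator(
    "Eleza kwa ufupi faida za kilimo cha kisasa.",  # "Briefly explain the benefits of modern farming."
    max_new_tokens=120,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
)
print(result[0]["generated_text"])
```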
---
## Key Features
- Optimized for African low-resource NLP contexts
- Instruction-following in Swahili and English
- Lightweight and efficient (QLoRA fine-tuned; runs on a single 24 GB GPU)
- Culturally aligned text generation
- Open-source and extendable to other African languages
---
## Limitations
- May underperform with heavy code-switching (Swahili-English mix)
- Not yet optimized for rare dialects or poetic forms
- Limited exposure to specialized (medical/legal) corpora
- Relies on accurate STT transcription in end-to-end speech-to-speech use
---
## Related Models
| Model | Description |
|--------|-------------|
| [`EYEDOL/salama-stt`](https://huggingface.co/EYEDOL/salama-stt) | Swahili Speech-to-Text model (Whisper-small fine-tuned) |
| [`EYEDOL/salama-tts`](https://huggingface.co/EYEDOL/salama-tts) | Swahili Text-to-Speech model (VITS architecture) |
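
Together these form the SALAMA speech-to-speech loop (STT → LLM → TTS). The sketch below wires them with standard Transformers pipelines; the pipeline tasks are inferred from the architectures listed above, so treat this as an assumption-laden outline rather than the official framework code, and check each model card for exact loading code and audio handling.
```python
# Assumption-laden sketch of the SALAMA speech-to-speech loop: STT -> LLM -> TTS.
# Pipeline tasks are inferred from the architectures above (Whisper STT, VITS TTS);
# see each model card for the exact usage and sampling rates.
from transformers import pipeline

stt = pipeline("automatic-speech-recognition", model="EYEDOL/salama-stt")
llm = pipeline("text-generation", model="EYEDOL/salama-llm", device_map="auto")
tts = pipeline("text-to-speech", model="EYEDOL/salama-tts")

def speech_to_speech(audio_path: str):
    swahili_text = stt(audio_path)["text"]                              # 1. transcribe speech
    reply = llm(swahili_text, max_new_tokens=120)[0]["generated_text"]  # 2. generate a reply
    return tts(reply)                                                   # 3. synthesize speech
```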
---
## Citation
If you use **SALAMA LLM**, please cite:
```bibtex
@misc{salama_llm_2025,
title={SALAMA LLM: Swahili Instruction-Tuned Text Generation Model},
author={AI4NNOV},
year={2025},
publisher={Hugging Face},
howpublished={\url{https://huggingface.co/EYEDOL/salama-llm}}
}
```
---
**"Elimu ni msingi wa maendeleo" (Knowledge is the foundation of progress.)**