metadata
language:
- en
- ny
- bem
tags:
- sentiment-analysis
- multilingual
- transformer
- zambia
- lusaka
license: apache-2.0
library_name: transformers
pipeline_tag: text-classification
base_model:
- google-bert/bert-base-multilingual-cased
datasets:
- michsethowusu/english-chichewa_sentence-pairs_mt560
- michsethowusu/Code-170k-bemba
- Beijuka/BEMBA_big_c
metrics:
- accuracy
- precision
- recall
- f1
- confusion_matrix
- validation_loss
model-index:
- name: LusakaLang
  results:
  - task:
      type: text-classification
      name: Sentiment Analysis
    dataset:
      name: LusakaLang Test Set
      type: lusakalang
      config: default
      split: test
    metrics:
    - type: accuracy
      value: 0.9973
      name: accuracy
    - type: precision
      value: 0.9973
      name: precision
    - type: recall
      value: 0.9973
      name: recall
    - type: f1
      value: 0.9978
      name: f1
Lusaka Language Analysis Model
The Lusaka Language Analysis model is a multilingual sentiment classification model fine-tuned from google-bert/bert-base-multilingual-cased (mBERT). It is built specifically for Zambian linguistic contexts, with a focus on:
- Zambian English (Lusaka variety)
- Bemba
- Nyanja (Chichewa)
The model is optimized to recognize mixed-language usage, local idioms, indirect expressions, and contextual sarcasm commonly found in everyday Zambian communication and social media discourse.
Task
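from transformers import pipeline

# Placeholder: replace with the actual Hub repository ID of this model.
MODEL_ID = "<your-namespace>/LusakaLang"

# Build the text-classification pipeline once and reuse it for every call.
classifier = pipeline("text-classification", model=MODEL_ID)

def classify_text(text):
    """
    Run inference on a single text input using the fine-tuned LusakaLang model.
    Returns the predicted label and confidence score.
    """
    result = classifier(text)[0]  # the pipeline returns one dict per input
    label = result["label"]
    score = round(result["score"], 4)
    return label, score

samples = [
    "Muli shani bane, nalishiba bwino.",
    "How are you doing today?",
    "Tili bwino, zikomo kwambiri.",
]

for s in samples:
    label, score = classify_text(s)
    print(f"Text: {s}\nPrediction: {label} (confidence={score})\n")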
Sample Output
Text: Muli shani bane, nalishiba bwino.
Prediction: Bemba (confidence=0.9821)
Text: How are you doing today?
Prediction: English (confidence=0.9954)
Text: Tili bwino, zikomo kwambiri.
Prediction: Nyanja (confidence=0.9736)
Language Graph
Note: The "unknown" language category here represents mixed-language text that blends English, Bemba, and Nyanja in varying degrees, e.g. "GPS yenze nama issues so it made me delay my journey kwati nibamudala."
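A per-language breakdown like the one described for the Language Graph can be tallied from a list of predicted labels. The snippet below is a minimal sketch, not the code used to produce the graph: `label_distribution` is a hypothetical helper, the label list is made-up placeholder data, and the label name "Unknown" is an assumption standing in for the mixed-language category mentioned in the note.

```python
from collections import Counter


def label_distribution(labels):
    """Return each label's share of the predictions, most frequent first."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.most_common()}


# Hypothetical predicted labels (placeholder data, not real model output).
# "Unknown" is an assumed label name for the mixed-language category.
predicted = ["Bemba", "English", "Nyanja", "Unknown", "Bemba", "Unknown"]

distribution = label_distribution(predicted)
for label, share in distribution.items():
    print(f"{label}: {share:.1%}")
```

In practice, the label list would be collected from the `label` field of each pipeline result rather than written by hand.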



