LusakaLang — Multilingual Sentiment Classification Model (English, Bemba, Nyanja)

Model Description

LusakaLang is a fine‑tuned version of bert-base-multilingual-cased for multilingual sentiment analysis in Zambia’s linguistic landscape. It is optimized for Zambian English, Bemba, Nyanja, and the code‑switching patterns common in Lusaka and other urban areas.

The model captures:

  • Zambian English idioms
  • Bemba and Nyanja sentiment cues
  • Mixed‑language slang
  • Urban Lusaka code‑switching
  • Indirect emotional expressions common in Zambian communication

This makes LusakaLang highly effective for real‑world sentiment tasks such as customer feedback, social media monitoring, and conversational analysis.


Training Performance (Epoch 30 — Final Model)

The model was trained for 30 epochs, with epoch 30 selected as the optimal checkpoint based on macro‑F1 performance and generalization stability.

Final Test Results (Epoch 30)

| Metric | Score |
|---|---|
| Accuracy | 0.9322 |
| Macro Precision | 0.9216 |
| Macro Recall | 0.9216 |
| Macro F1 | 0.9216 |
| Test Loss | 0.4025 |

Per‑Class Performance

| Class | Precision | Recall | F1 |
|---|---|---|---|
| Negative | 0.8649 | 0.8649 | 0.8649 |
| Neutral | 0.95 | 0.95 | 0.95 |
| Positive | 0.95 | 0.95 | 0.95 |

These results indicate strong generalization and balanced performance across classes; the negative class remains the most challenging, at 0.8649 F1.
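As a quick sanity check, the macro scores above are the unweighted means of the per‑class values. A minimal sketch, using the class names and F1 scores from the tables above:

```python
# Sanity check: macro F1 is the unweighted mean of the per-class F1 scores.
per_class_f1 = {"negative": 0.8649, "neutral": 0.95, "positive": 0.95}

macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)
print(round(macro_f1, 4))  # 0.9216
```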


Training Data


The model was trained using a multilingual dataset combining:

  • Zambian English
  • Bemba
  • Nyanja
  • Code‑switched text
  • Social media‑style expressions
  • Local idioms and sentiment cues

Why LusakaLang Performs Better

1. Understanding Zambian English Nuances

Examples:

  • “I’m just there” → Neutral
  • “I’m not fine but I’m okay” → Neutral
  • “I’m feeling somehow” → Neutral
  • “Believe you me” → Neutral
  • “It’s fine” → Negative (Zambian tone)
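The idiom examples above lend themselves to a small regression test. A minimal sketch: `expected` pairs a few phrases with the labels from the list above, and `classify` is a stand‑in for any callable that maps text to a label string (for instance, a thin wrapper around the inference code shown later in this card):

```python
# Minimal smoke test for the Zambian English examples above.
# `classify` is a placeholder: swap in any callable that maps
# text -> "negative" | "neutral" | "positive".
expected = {
    "I'm just there": "neutral",
    "I'm feeling somehow": "neutral",
    "It's fine": "negative",  # Zambian tone
}

def accuracy(classify, cases):
    """Fraction of cases where the classifier matches the expected label."""
    hits = sum(classify(text) == label for text, label in cases.items())
    return hits / len(cases)

# Trivial stand-in classifier, for illustration only.
stub = lambda text: "negative" if text == "It's fine" else "neutral"
print(accuracy(stub, expected))  # 1.0
```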

2. Handling Bemba/Nyanja Idioms

Examples:

  • “Nimvela bwino” → Positive
  • “Nimvelako bwino but…” → Neutral
  • “Nima one boi” → Negative
  • “Niba kalijo baja naiwe” → Negative

3. Code‑Switching Awareness

The model handles:

  • English + Bemba
  • English + Nyanja
  • English + slang
  • Mixed 3‑language expressions

4. Sarcasm Detection (Zambian Style)

Examples:

  • “Wow, great service” → Negative
  • “Nice, just what I needed” → Negative
  • “Perfect timing!” → Negative

Bias, Risks, and Limitations

  • Optimized for Zambia; may not generalize to other African regions.
  • Sarcasm and indirect expressions can still be ambiguous.
  • Not suitable for high‑risk decision‑making without human review.
  • Best for short conversational text, not long documents.
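Given the human‑review caveat above, a common deployment pattern is to route low‑confidence predictions to a reviewer. A minimal sketch, assuming predictions arrive in the pipeline’s usual `{"label": ..., "score": ...}` format; the 0.7 threshold is an illustrative choice, not a tuned value:

```python
# Route low-confidence predictions to human review.
# Assumes predictions in the pipeline's {"label", "score"} dict format;
# the 0.7 threshold is illustrative, not tuned.
REVIEW_THRESHOLD = 0.7

def triage(prediction):
    """Return (label, routing) where routing is 'auto' or 'needs_human_review'."""
    if prediction["score"] >= REVIEW_THRESHOLD:
        return prediction["label"], "auto"
    return prediction["label"], "needs_human_review"

print(triage({"label": "negative", "score": 0.95}))  # ('negative', 'auto')
print(triage({"label": "neutral", "score": 0.55}))   # ('neutral', 'needs_human_review')
```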

How to Use This Model

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

# Load the fine-tuned checkpoint and its tokenizer from the Hub.
model_ckpt = "Kelvinmbewe/LusakaLang"
tokenizer = AutoTokenizer.from_pretrained(model_ckpt)
model = AutoModelForSequenceClassification.from_pretrained(model_ckpt)

# Build a text-classification pipeline from the loaded model and tokenizer.
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
classifier("Driver was very professional and polite.")

def label_text(text):
    """Return (label_id, label_name) for a single input string."""
    result = classifier(text)[0]
    sentiment = result["label"].lower()
    mapping = {"negative": 0, "neutral": 1, "positive": 2}
    return mapping[sentiment], sentiment

print(label_text("Umufyashi ailetelela bwino no mutende."))  # Bemba
print(label_text("Galimoto inachedwa koma woyendetsa anali wabwino."))  # Nyanja
print(label_text("The ride was okay, but the driver was over speeding."))  # English
```


Model size: 0.2B parameters (F32, Safetensors).