---
language:
- en
- ny
- bem
tags:
- sentiment-analysis
- multilingual
- transformer
- zambia
- lusaka
license: apache-2.0
library_name: transformers
pipeline_tag: text-classification
base_model:
- google-bert/bert-base-multilingual-cased
datasets:
- michsethowusu/english-chichewa_sentence-pairs_mt560
- michsethowusu/Code-170k-bemba
- Beijuka/BEMBA_big_c
metrics:
- accuracy
- precision
- recall
- f1
- confusion_matrix
- validation_loss
model-index:
- name: LusakaLang
  results:
  - task:
      type: text-classification
      name: Sentiment Analysis
    dataset:
      name: LusakaLang Test Set
      type: lusakalang
      config: default
      split: test
    metrics:
    - type: accuracy
      value: 0.9973
      name: accuracy
    - type: precision
      value: 0.9973
      name: precision
    - type: recall
      value: 0.9973
      name: recall
    - type: f1
      value: 0.9978
      name: f1
---
## **Lusaka Language Analysis Model**
The Lusaka Language Analysis model (LusakaLang) is a multilingual sentiment classification model fine‑tuned from `google-bert/bert-base-multilingual-cased` (mBERT). It is built specifically for Zambian linguistic contexts, with a focus on:
- Zambian English (Lusaka variety)
- Bemba
- Nyanja (Chichewa)

The model is optimized to recognize mixed-language usage, local idioms, indirect expressions, and contextual sarcasm commonly found in everyday Zambian communication and social media discourse.

---
## Task
```python
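from transformers import pipeline

# Placeholder: replace with this model's Hugging Face Hub repo id or a local path.
MODEL_ID = "path/to/LusakaLang"

# Build the text-classification pipeline once and reuse it for every call.
classifier = pipeline("text-classification", model=MODEL_ID)


def classify_text(text):
    """
    Run inference on a single text input using the fine‑tuned LusakaLang model.
    Returns the predicted label and confidence score.
    """
    result = classifier(text)[0]
    label = result["label"]
    score = round(result["score"], 4)
    return label, score


samples = [
    "Muli shani bane, nalishiba bwino.",
    "How are you doing today?",
    "Tili bwino, zikomo kwambiri.",
]

for s in samples:
    label, score = classify_text(s)
    print(f"Text: {s}\nPrediction: {label} (confidence={score})\n")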
```
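
The same pattern extends to batches, since a `transformers` text-classification pipeline also accepts a list of strings and returns one result dict per input. The sketch below is only an illustration and not part of the released code: `classify_batch` and `fake_classifier` are hypothetical names, and the stub stands in for the real pipeline so the snippet runs without downloading any weights.

```python
from typing import Callable, Dict, List, Tuple


def classify_batch(
    classifier: Callable[[List[str]], List[Dict[str, object]]],
    texts: List[str],
) -> List[Tuple[str, str, float]]:
    """Classify several texts in one call and return (text, label, score) rows."""
    if not texts:
        return []
    results = classifier(texts)  # one {"label": ..., "score": ...} dict per text
    return [
        (text, str(res["label"]), round(float(res["score"]), 4))
        for text, res in zip(texts, results)
    ]


def fake_classifier(texts: List[str]) -> List[Dict[str, object]]:
    """Hypothetical stand-in for the real pipeline; returns fixed dummy output."""
    return [{"label": "LABEL_0", "score": 0.5} for _ in texts]


rows = classify_batch(fake_classifier, ["Muli shani?", "How are you?"])
for text, label, score in rows:
    print(f"{text} -> {label} ({score})")
```

To use it with the real model, pass the `classifier` pipeline object in place of `fake_classifier`.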
## Sample Output
```text
Text: Muli shani bane, nalishiba bwino.
Prediction: Bemba (confidence=0.9821)

Text: How are you doing today?
Prediction: English (confidence=0.9954)

Text: Tili bwino, zikomo kwambiri.
Prediction: Nyanja (confidence=0.9736)
```
---
## Language Graph

> Note: The "unknown" language label here represents mixed-language text that combines English, Bemba, and Nyanja to varying degrees, e.g. "GPS yenze nama issues so it made me delay my journey kwati nibamudala."
## Classification Report

## Confusion Matrix
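
For readers who want to reproduce this kind of evaluation on their own labelled data, the following is a minimal pure-Python sketch of how a confusion matrix and per-class precision, recall, and F1 can be derived from true and predicted labels. The helper names `confusion_counts` and `per_class_report` are hypothetical, and the toy labels are made up for illustration; they are not the model's test-set results.

```python
from collections import Counter
from typing import Dict, List, Tuple


def confusion_counts(y_true: List[str], y_pred: List[str]) -> Counter:
    """Count how often each (true label, predicted label) pair occurs."""
    return Counter(zip(y_true, y_pred))


def per_class_report(
    y_true: List[str], y_pred: List[str]
) -> Dict[str, Tuple[float, float, float]]:
    """Return {label: (precision, recall, f1)} computed from the confusion counts."""
    counts = confusion_counts(y_true, y_pred)
    labels = sorted(set(y_true) | set(y_pred))
    report = {}
    for label in labels:
        tp = counts[(label, label)]
        # Predicted as `label` but actually another class.
        fp = sum(counts[(other, label)] for other in labels if other != label)
        # Actually `label` but predicted as another class.
        fn = sum(counts[(label, other)] for other in labels if other != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        report[label] = (precision, recall, f1)
    return report


# Toy labels, invented for illustration only.
y_true = ["Bemba", "Bemba", "English", "Nyanja", "Nyanja"]
y_pred = ["Bemba", "Nyanja", "English", "Nyanja", "Nyanja"]

report = per_class_report(y_true, y_pred)
for label, (p, r, f1) in report.items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

In practice, library routines such as scikit-learn's `classification_report` and `confusion_matrix` compute the same quantities; the sketch only makes the arithmetic explicit.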

## Word Cloud
