---
language:
- en
- ny
- bem
tags:
- sentiment-analysis
- multilingual
- transformer
- zambia
- lusaka
license: apache-2.0
library_name: transformers
pipeline_tag: text-classification
base_model:
- google-bert/bert-base-multilingual-cased
datasets:
- michsethowusu/english-chichewa_sentence-pairs_mt560
- michsethowusu/Code-170k-bemba
- Beijuka/BEMBA_big_c
metrics:
- accuracy
- precision
- recall
- f1
- confusion_matrix
- validation_loss
model-index:
- name: LusakaLang
  results:
  - task:
      type: text-classification
      name: Sentiment Analysis
    dataset:
      name: LusakaLang Test Set
      type: lusakalang
      config: default
      split: test
    metrics:
    - type: accuracy
      value: 0.9973
      name: accuracy
    - type: precision
      value: 0.9973
      name: precision
    - type: recall
      value: 0.9973
      name: recall
    - type: f1
      value: 0.9978
      name: f1
---

## **Lusaka Language Analysis Model**

The Lusaka Language Analysis model (LusakaLang) is a multilingual sentiment classification model fine-tuned from `google-bert/bert-base-multilingual-cased` (mBERT). It is built specifically for Zambian linguistic contexts, with a focus on:

- Zambian English (Lusaka variety)
- Bemba
- Nyanja (Chichewa)

The model is optimized to recognize mixed-language usage, local idioms, indirect expressions, and contextual sarcasm commonly found in everyday Zambian communication and social media discourse.

---

## Task

```python
from transformers import pipeline

# Placeholder: replace with the Hub repository ID or a local path
# to the fine-tuned LusakaLang checkpoint.
MODEL_ID = "path/to/LusakaLang"

classifier = pipeline("text-classification", model=MODEL_ID)


def classify_text(text):
    """
    Run inference on a single text input using the fine-tuned LusakaLang model.
    Returns the predicted label and confidence score.
    """
    # The pipeline returns a list with one dict per input text.
    result = classifier(text)[0]
    label = result["label"]
    score = round(result["score"], 4)
    return label, score


samples = [
    "Muli shani bane, nalishiba bwino.",
    "How are you doing today?",
    "Tili bwino, zikomo kwambiri.",
]

for s in samples:
    label, score = classify_text(s)
    print(f"Text: {s}\nPrediction: {label} (confidence={score})\n")
```

## Sample Output

```text
Text: Muli shani bane, nalishiba bwino.
Prediction: Bemba (confidence=0.9821)

Text: How are you doing today?
Prediction: English (confidence=0.9954)

Text: Tili bwino, zikomo kwambiri.
Prediction: Nyanja (confidence=0.9736)
```

---

## Language Graph

![image](https://cdn-uploads.huggingface.co/production/uploads/674ed988f86d2ca07fa23abe/OTroxtjtYgvijaMcv4Tpn.png)

> Note: The unknown language here represents code-mixed text that combines English, Bemba, and Nyanja in varying degrees, e.g. "GPS yenze nama issues so it made me delay my journey kwati nibamudala."

## Classification Report

![image](https://cdn-uploads.huggingface.co/production/uploads/674ed988f86d2ca07fa23abe/v5eLxfxuKDJ7Sd8uX2P9s.png)

## Confusion Matrix

![image](https://cdn-uploads.huggingface.co/production/uploads/674ed988f86d2ca07fa23abe/mxnDjRmAX-XLHzMfcWnfr.png)

## Word Cloud

![image](https://cdn-uploads.huggingface.co/production/uploads/674ed988f86d2ca07fa23abe/J-atqadjfCh7xUKRSRSnL.png)
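
## Relating the Confusion Matrix to the Reported Metrics

The accuracy, precision, recall, and F1 values listed for this model are all derived from the kind of counts shown in the confusion matrix. As a minimal sketch of that relationship, the snippet below computes per-class precision, recall, and F1, plus overall accuracy, from a small matrix. The helper names (`per_class_metrics`, `accuracy`) and the counts in `TOY_MATRIX` are assumptions made up for illustration; they are not the model's actual results, and this card does not state how its aggregate scores were averaged across classes.

```python
from typing import Dict, List


def per_class_metrics(
    labels: List[str], matrix: List[List[int]]
) -> Dict[str, Dict[str, float]]:
    """Compute precision, recall, and F1 for each class.

    matrix[i][j] is the number of samples whose true label is labels[i]
    and whose predicted label is labels[j].
    """
    metrics = {}
    for i, label in enumerate(labels):
        tp = matrix[i][i]                          # correctly predicted as this class
        fn = sum(matrix[i]) - tp                   # this class, predicted as another
        fp = sum(row[i] for row in matrix) - tp    # another class, predicted as this
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


def accuracy(matrix: List[List[int]]) -> float:
    """Fraction of samples on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


# Toy counts for illustration only; NOT taken from the model's evaluation.
LABELS = ["Bemba", "English", "Nyanja"]
TOY_MATRIX = [
    [48, 1, 1],   # true Bemba
    [0, 50, 0],   # true English
    [2, 0, 48],   # true Nyanja
]

for name, scores in per_class_metrics(LABELS, TOY_MATRIX).items():
    print(name, {k: round(v, 4) for k, v in scores.items()})
print("accuracy", round(accuracy(TOY_MATRIX), 4))
```

Rows are true labels and columns are predicted labels in this sketch; if a plotted matrix uses the opposite orientation, transpose it before passing it in.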