Kelvinmbewe committed on
Commit 832c90b · verified · 1 Parent(s): e7904a9

Update README.md

Files changed (1): README.md +20 -71
README.md CHANGED
@@ -55,53 +55,36 @@ model-index:
        name: f1_macro
  ---
 
- # **LusakaLang Multi‑Task Model (Language + Sentiment + Topic)**
-
- ## **Model Description**
-
- **LusakaLang‑MultiTask** is a unified transformer model built on top of **`bert-base-multilingual-cased`**, designed to perform **three tasks simultaneously**:
-
- 1. **Language Identification**
- 2. **Sentiment Analysis**
- 3. **Topic Classification**
-
- The model integrates three fine‑tuned LusakaLang checkpoints:
-
- - `Kelvinmbewe/mbert_Lusaka_Language_Analysis`
- - `Kelvinmbewe/mbert_LusakaLang_Sentiment_Analysis`
- - `Kelvinmbewe/mbert_LusakaLang_Topic`
-
- All tasks share a **single mBERT encoder**, with **three independent classifier heads**.
- This architecture improves efficiency, reduces memory footprint, and enables consistent predictions across tasks.
+ ## **LusakaLang Multi‑Task Model (Language + Sentiment + Topic)**
+
+ This model is a unified transformer architecture built on top of **`bert-base-multilingual-cased`**, designed to perform **three tasks simultaneously**:
+
+ 1. **Language Identification**
+ 2. **Sentiment Analysis**
+ 3. **Topic Classification**
+
+ The system integrates three fine‑tuned LusakaLang checkpoints:
+
+ - **`Kelvinmbewe/mbert_Lusaka_Language_Analysis`**
+ - **`Kelvinmbewe/mbert_LusakaLang_Sentiment_Analysis`**
+ - **`Kelvinmbewe/mbert_LusakaLang_Topic`**
+
+ All tasks share a single mBERT encoder, supported by three independent classifier heads. This architecture enhances computational efficiency, reduces memory overhead, and promotes consistent predictions across all tasks.
 
  ---
 
- # **Why This Model Matters**
-
- Zambian communication is multilingual, fluid, and highly context‑dependent.
- A single message may include:
-
- - English
- - Bemba
- - Nyanja
- - Slang
- - Code‑switching
- - Cultural idioms
- - Indirect emotional cues
-
- This model is designed specifically for that environment.
-
- It excels at:
-
- - Identifying the **dominant language** or **code‑switching**
- - Detecting **sentiment polarity** in culturally nuanced text
- - Classifying **topics** such as:
-   - driver behaviour
-   - payment issues
-   - app performance
-   - customer support
-   - ride availability
+ ## **Why This Model Matters**
+
+ Zambian communication is inherently multilingual, fluid, and deeply shaped by context. A single message may blend English, Bemba, Nyanja, local slang, and frequent code‑switching, often expressed through culturally grounded idioms and subtle emotional cues. This model is designed specifically for that environment, where meaning depends not only on the words used but on how languages interact within a single utterance.
+
+ It excels at identifying the dominant language or detecting when multiple languages are used together, interpreting sentiment even when it is conveyed indirectly or through culturally specific phrasing, and classifying text into practical topics such as driver behaviour, payment issues, app performance, customer support, and ride availability. By capturing these nuances, the model provides a more accurate and context‑aware understanding of real Zambian communication.
 
  ---
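The shared-encoder design described in the hunk above can be made concrete with a short sketch. This is an illustrative reconstruction only, assuming PyTorch and the Transformers `AutoModel` API: the class name `MultiTaskMBert`, the head names, and the label counts are placeholder assumptions, not the actual configuration of the published checkpoints.

```python
# Minimal sketch of the shared-encoder / three-head design described above.
# Class name, head names, and label counts are illustrative assumptions.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class MultiTaskMBert(nn.Module):
    def __init__(self, base_model="bert-base-multilingual-cased",
                 num_languages=3, num_sentiments=3, num_topics=5):
        super().__init__()
        # One shared mBERT encoder serves all three tasks.
        self.encoder = AutoModel.from_pretrained(base_model)
        hidden = self.encoder.config.hidden_size
        # Three independent classifier heads on top of the [CLS] representation.
        self.language_head = nn.Linear(hidden, num_languages)
        self.sentiment_head = nn.Linear(hidden, num_sentiments)
        self.topic_head = nn.Linear(hidden, num_topics)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        pooled = out.last_hidden_state[:, 0]  # [CLS] token state
        return {
            "language": self.language_head(pooled),
            "sentiment": self.sentiment_head(pooled),
            "topic": self.topic_head(pooled),
        }

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = MultiTaskMBert()  # heads are randomly initialised here; the published checkpoint provides trained weights
batch = tokenizer(["Driver anafika late koma service inali fine"],
                  return_tensors="pt", padding=True, truncation=True)
with torch.no_grad():
    logits = model(batch["input_ids"], batch["attention_mask"])
print({task: int(v.argmax(dim=-1)) for task, v in logits.items()})
```

In a setup like this, only the three `nn.Linear` heads are task-specific, so the encoder's parameters and activations are shared across all three predictions for a given input.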
 
@@ -118,34 +101,9 @@ This multi‑task setup improves generalization and reduces inference cost.
 
  ---
 
- # **Performance Summary**
-
- ## **Language Identification**
- | Metric | Score |
- |--------|--------|
- | Accuracy | 0.97 |
- | Macro‑F1 | 0.96 |
-
- ## **Sentiment Analysis (Epoch 30 — Final Checkpoint)**
- | Metric | Score |
- |--------|--------|
- | Accuracy | 0.9322 |
- | Macro‑F1 | 0.9216 |
- | Negative F1 | 0.8649 |
- | Neutral F1 | 0.95 |
- | Positive F1 | 0.95 |
-
- ## **Topic Classification**
- | Metric | Score |
- |--------|--------|
- | Accuracy | 0.91 |
- | Macro‑F1 | 0.90 |
-
- ---
-
- # **How to Use This Model**
-
- ## **Load the Multi‑Task Model**
+ ## **How to Use This Model**
 
  ```python
  from transformers import AutoTokenizer
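The usage example in this hunk is collapsed by the diff view (only its first line is shown), and the next hunk's context line references a `predict_topic` helper. As a rough, hypothetical sketch of how such a helper could be written against the topic checkpoint listed on this model card, assuming that checkpoint exposes a standard sequence-classification head:

```python
# Illustrative sketch only: one way a `predict_topic` helper might look.
# The checkpoint name is taken from the model card; the helper itself is a
# guess at the collapsed code, not a copy of it.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

checkpoint = "Kelvinmbewe/mbert_LusakaLang_Topic"  # topic-classification component
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint)
model.eval()

def predict_topic(texts):
    """Return one topic label per input string."""
    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        logits = model(**batch).logits
    ids = logits.argmax(dim=-1).tolist()
    # id2label comes from the checkpoint config (e.g. driver behaviour, payment issues, ...).
    return [model.config.id2label[i] for i in ids]

print(predict_topic(["The app keeps crashing when I try to pay"]))
```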
@@ -182,15 +140,6 @@ predict_topic([
  ```
 
 
- ```python
- @model{LusakaLangMultiTask,
- author = {Kelvin Mbewe},
- title = {LusakaLang Multi-Task Model},
- year = 2025,
- url = {https://huggingface.co/Kelvinmbewe/LusakaLang-MultiTask}
- }
- ```
-
 
  ```python