learn-abc committed on
Commit 01253aa · verified · 1 Parent(s): b9193e3

update README file based on improved version of model

Files changed (1): README.md (+159 −169)

README.md CHANGED
@@ -4,38 +4,91 @@ tags:
  - finance
  license: mit
  datasets:
- - PolyAI/banking77
  language:
  - en
  - bn
  base_model:
  - google/muril-base-cased
  ---
 
  # Banking Multilingual Intent Classifier
 
- - **Repository:** `learn-abc/banking-multilingual-intent-classifier`
- - **Base Model:** `google/muril-base-cased`
- - **Task:** Multilingual Intent Classification (Banking Domain)
- - **Languages:** English, Bangla (bn), Bangla Latin (bn-latn), Code-Mixed
 
  ---
 
- # Model Overview
 
- This model is a multilingual banking intent classifier fine-tuned on a balanced English–Bangla–Banglish dataset derived from Banking77 and extended with synthetic code-mixed augmentation.
 
- It is designed for:
 
- * AI banking assistants
- * Multilingual chatbots
  * Voice-to-intent pipelines
- * Intent routing systems
- * Hybrid Bangla-English financial applications
 
  ---
 
- # Supported Intents (14 Classes)
 
  ```
  ACCOUNT_INFO
@@ -56,206 +109,143 @@ TRANSFER
 
  ---
 
- # Dataset Details
 
- ### Total Samples
-
- 66,768
-
- ### Language Distribution
-
- * English (en): 22,256
- * Bangla (bn): 22,256
- * Bangla Latin (bn-latn): 22,256
 
- ### Code-Mixed Augmentation
 
- * 2,500 synthetic code-mixed examples added
 
  ---
 
- ### Final Training Split
 
- * Train: 63,306
- * Test: 13,854
 
- ---
 
- # Training Configuration
 
- * Base Model: `google/muril-base-cased`
- * Architecture: `BertForSequenceClassification`
- * Epochs: 7
- * Class weights applied to address imbalance
- * Tokenizer: MuRIL tokenizer
- * Framework: Hugging Face Transformers
 
- Note: Some classifier layers were newly initialized (expected when adapting base MuRIL to classification head).
 
  ---
 
- # Evaluation Results
-
- ## Overall Performance
-
- | Metric | Score |
- | --------- | ---------- |
- | Accuracy | **99.57%** |
- | F1 Micro | **0.9957** |
- | F1 Macro | **0.9959** |
- | Eval Loss | 0.0178 |
-
- - Evaluation runtime: 10.1 seconds
- - Samples/sec: 1365
 
  ---
 
- ## Language-wise Performance
 
- | Language | Accuracy |
- | ------------ | -------- |
- | English | 99.26% |
- | Bangla | 99.80% |
- | Bangla Latin | 99.62% |
- | Code-Mixed | 100.00% |
 
  ---
 
- # Multilingual Prediction Examples
-
- | Input | Language | Prediction |
- | ----------------------------- | ---------- | ------------------- |
- | what is my balance | en | CHECK_BALANCE |
- | আমার ব্যালেন্স কত | bn | CHECK_BALANCE |
- | amar balance koto ache | bn-latn | CHECK_BALANCE |
- | আমার balance দেখাও | code-mixed | CHECK_BALANCE |
- | card ta hariye geche | bn-latn | LOST_OR_STOLEN_CARD |
- | weather kemon | code-mixed | FALLBACK |
-
- All tested predictions returned high confidence (~1.000).
 
- ---
 
- # Intended Use Cases
 
- * Banking chatbot intent routing
- * Voice assistant → STT → Intent classification
- * Multilingual customer support
- * Code-mixed South Asian applications
- * Fintech AI pipelines
 
  ---
 
- # Limitations
-
- 1. Domain-specific: Focused only on banking intents.
- 2. Synthetic augmentation: Code-mixed data partially generated programmatically.
- 3. Overconfidence: Softmax confidence may saturate near 1.0.
- 4. Not tested on adversarial or out-of-distribution queries.
- 5. Not designed for generative responses, classification only.
 
- ---
 
- # Architecture Notes
 
- * Based on MuRIL, optimized for Indian languages.
- * Classification head added on top of encoder.
- * Some warnings regarding unexpected/missing keys are normal due to task adaptation.
- * Class weights applied to handle skewed distribution.
 
  ---
 
- # Bias & Fairness
 
- * Balanced across 3 language representations.
- * Augmented for code-mixed robustness.
- * May not generalize to:
-
-   * Non-banking domains
-   * Slang-heavy dialects outside training distribution
 
  ---
 
- # Example Usage
-
- ```python
- from transformers import AutoTokenizer, AutoModelForSequenceClassification
- import torch
-
- # Load model and tokenizer
- model_name = "learn-abc/banking-multilingual-intent-classifier"
- tokenizer = AutoTokenizer.from_pretrained(model_name)
- model = AutoModelForSequenceClassification.from_pretrained(model_name)
-
- device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
- model.to(device)
- model.eval()
-
- # Prediction function
- def predict_intent(text):
-     inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=64)
-     inputs = {k: v.to(device) for k, v in inputs.items()}
-
-     with torch.no_grad():
-         outputs = model(**inputs)
-         prediction = torch.argmax(outputs.logits, dim=-1).item()
-         confidence = torch.softmax(outputs.logits, dim=-1)[0][prediction].item()
-
-     predicted_intent = model.config.id2label[prediction]
-
-     return {
-         "intent": predicted_intent,
-         "confidence": confidence
-     }
-
- # Example usage - English
- result = predict_intent("what is my balance")
- print(f"Intent: {result['intent']}, Confidence: {result['confidence']:.2f}")
- # Output: Intent: CHECK_BALANCE, Confidence: 0.99
-
- # Example usage - Bangla
- result = predict_intent("আমার ব্যালেন্স কত")
- print(f"Intent: {result['intent']}, Confidence: {result['confidence']:.2f}")
- # Output: Intent: CHECK_BALANCE, Confidence: 0.98
-
- # Example usage - Banglish (Romanized)
- result = predict_intent("amar balance koto ache")
- print(f"Intent: {result['intent']}, Confidence: {result['confidence']:.2f}")
- # Output: Intent: CHECK_BALANCE, Confidence: 0.97
-
- # Example usage - Code-mixed
- result = predict_intent("আমার last 10 transaction দেখাও")
- print(f"Intent: {result['intent']}, Confidence: {result['confidence']:.2f}")
- # Output: Intent: MINI_STATEMENT, Confidence: 0.98
- ```
 
- ---
 
- # Production Recommendations
 
- For real-world deployment:
 
- * Add confidence threshold fallback
- * Add OOD detector
- * Combine with:
-
-   * STT system
-   * Intent router
-   * Business rule engine
- * Log misclassifications for continual fine-tuning
 
  ---
 
- # Summary
 
- This model achieves near state-of-the-art multilingual intent classification accuracy for banking-specific queries across:
 
- * English
- * Bangla (native script)
- * Bangla Latin
- * Code-mixed variants
 
- It is optimized for fintech AI systems targeting South Asian multilingual users.
 
  ## License
  This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
 
  - finance
  license: mit
  datasets:
+ - learn-abc/banking-intent-dataset
  language:
  - en
  - bn
  base_model:
  - google/muril-base-cased
+ metrics:
+ - accuracy
+ pipeline_tag: text-classification
+ ---
+
+
  ---
 
 
  # Banking Multilingual Intent Classifier
 
+ **Model Name:** Banking Multilingual Intent Classifier
+ **Base Model:** google/muril-base-cased
+ **Task:** Multilingual Intent Classification
+ **Intents:** 14
+ **Languages:** English, Bangla (Bengali script), Banglish (Romanized Bengali), Code-Mixed
 
  ---
 
+ ## 1. Model Overview
+
+ This model is a multilingual intent classifier designed for production-grade banking chatbot systems. It supports English, Bangla (Bengali script), and Banglish, including limited code-mixed input.
+
+ The model classifies user queries into 14 banking-specific intents with strong fallback detection for out-of-domain queries.
+
+ ---
 
+ ## 2. Intended Use
 
+ ### Primary Use Cases
 
+ * Banking virtual assistants
+ * Customer support chatbots
  * Voice-to-intent pipelines
+ * Multilingual conversational banking systems
+
+ ### Supported Capabilities
+
+ * Transaction queries
+ * Balance inquiries
+ * Card management
+ * Lost/stolen card reporting
+ * Fee clarification
+ * ATM issues
+ * Account updates
+ * General banking information
+ * Robust fallback detection for non-banking queries
 
  ---
 
+ ## 3. Dataset Summary
+
+ ### Total Samples
+
+ 110,364 original samples, plus 500 additional code-mixed samples added as training augmentation.
+
+ Final training size:
+
+ * Train: 99,273
+ * Test: 22,173
+
+ ### Language Distribution
+
+ | Language | Count |
+ | ---------- | ------ |
+ | English | 36,788 |
+ | Bangla | 36,788 |
+ | Banglish | 36,788 |
+ | Code-Mixed | ~0.45% of samples |
+
+ Balanced across the three main languages.
+
+ ---
+
+ ## 4. Intent Classes
+
+ Total Intents: 14
 
  ```
  ACCOUNT_INFO
 
 
  ---
 
+ ## 5. Data Characteristics
 
+ * Stratified 80/20 split
+ * Balanced language distribution
+ * Weighted loss for class imbalance
+ * Lowercase augmentation applied
+ * Hard negative examples included for:
+   * General knowledge
+   * Math queries
+   * Stock/crypto
+   * Biography queries
+   * Metaphorical financial language
+   * Government and legal topics
 
+ FALLBACK class strengthened for production safety.
 
  ---
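The stratified 80/20 split above can be sketched in plain Python. This is an illustrative sketch only; the actual split was presumably done with a library utility such as scikit-learn's `train_test_split(..., stratify=...)`, which the model card does not confirm.

```python
import random
from collections import defaultdict

def stratified_split(samples, labels, test_frac=0.2, seed=42):
    """Split (sample, label) pairs so that each label keeps the same
    train/test proportion -- i.e. a stratified 80/20 split."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_label[label].append(sample)

    train, test = [], []
    for label, group in by_label.items():
        rng.shuffle(group)
        n_test = round(len(group) * test_frac)
        test.extend((s, label) for s in group[:n_test])
        train.extend((s, label) for s in group[n_test:])
    return train, test

# Toy usage: 10 samples per intent -> 8/2 split per intent
texts = [f"query {i}" for i in range(20)]
intents = ["CHECK_BALANCE"] * 10 + ["FALLBACK"] * 10
train, test = stratified_split(texts, intents)
print(len(train), len(test))  # 16 4
```

Stratification matters here because the FALLBACK hard negatives must appear in both splits for the per-intent accuracies reported below to be meaningful.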
 
+ ## 6. Training Configuration
 
+ Base Model: MuRIL (multilingual BERT for Indic languages)
 
+ Hyperparameters:
 
+ * Epochs: 5
+ * Batch Size: 16 (with gradient accumulation = 2)
+ * Learning Rate: 5e-5
+ * Scheduler: Cosine
+ * Weight Decay: 0.01
+ * Early Stopping Enabled
+ * Weighted Cross-Entropy Loss
 
+ Max Sequence Length: 64
 
+ Hardware: GPU (CUDA)
 
  ---
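The weighted cross-entropy loss listed above needs per-class weights. A common choice, and an assumption here since the card does not state the exact formula, is inverse-frequency weighting:

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Per-class weights for weighted cross-entropy:
    weight(c) = total / (num_classes * count(c)),
    so rarer classes contribute more to the loss."""
    counts = Counter(labels)
    total = len(labels)
    n_classes = len(counts)
    return {c: total / (n_classes * n) for c, n in counts.items()}

# Toy usage: FALLBACK is 3x rarer than TRANSFER, so its weight is 3x larger
labels = ["TRANSFER"] * 75 + ["FALLBACK"] * 25
weights = inverse_frequency_weights(labels)
print(weights["FALLBACK"] / weights["TRANSFER"])  # 3.0
```

In a PyTorch training loop these weights would typically be ordered by label id and passed as a tensor to `torch.nn.CrossEntropyLoss(weight=...)`.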
 
+ ## 7. Evaluation Results
+
+ ### Overall Performance (Test Set: 22,173 samples)
+
+ * Accuracy: 98.36%
+ * F1 Micro: 98.36%
+ * F1 Macro: 98.21%
+
+ ### Accuracy by Intent
+
+ | Intent | Accuracy |
+ | --------------------- | -------- |
+ | ACCOUNT_INFO | 99.27% |
+ | ATM_SUPPORT | 99.08% |
+ | CARD_ISSUE | 99.15% |
+ | CARD_MANAGEMENT | 98.70% |
+ | CARD_REPLACEMENT | 99.55% |
+ | CHECK_BALANCE | 97.77% |
+ | EDIT_PERSONAL_DETAILS | 99.66% |
+ | FAILED_TRANSFER | 98.62% |
+ | FALLBACK | 97.04% |
+ | FEES | 99.58% |
+ | GREETING | 95.02% |
+ | LOST_OR_STOLEN_CARD | 98.43% |
+ | MINI_STATEMENT | 98.56% |
+ | TRANSFER | 99.25% |
 
  ---
 
+ ## 8. Strengths
 
+ * Strong multilingual generalization
+ * High performance on transactional intents
+ * Robust fallback detection for out-of-domain queries
+ * Resistant to keyword leakage
+ * Stable performance across English, Bangla, Banglish
+ * Class imbalance handled using weighted loss
+ * Production-safe fallback tuning
 
  ---
191
 
192
+ ## 9. Known Limitations
 
 
 
 
 
 
 
 
 
 
 
193
 
194
+ * Very short ambiguous inputs may drift between:
195
 
196
+ * GREETING
197
+ * FALLBACK
198
+ * Highly ambiguous informational queries may overlap between:
199
 
200
+ * MINI_STATEMENT
201
+ * ACCOUNT_INFO
202
+ * Code-mixed coverage is limited compared to core languages
203
+ * Model not optimized for long multi-turn conversational memory
 
204
 
205
  ---
206
 
207
+ ## 10. Safety & Risk Considerations
 
 
 
 
 
 
208
 
209
+ * Model prioritizes safe fallback over risky misclassification.
210
+ * Non-banking queries are correctly routed to FALLBACK.
211
+ * Reduces risk of executing unintended financial actions.
212
 
213
+ Recommended Production Safeguards:
214
 
215
+ * Confidence threshold filtering
216
+ * Human fallback escalation for low-confidence cases
217
+ * Logging for monitoring drift
 
218
 
219
  ---
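Confidence threshold filtering, the first safeguard above, can be sketched as a small gate over the classifier's softmax output. The threshold value and label ids below are illustrative assumptions, not values from the model card:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def gate_intent(logits, id2label, threshold=0.85):
    """Return the predicted intent only when the model is confident;
    otherwise fall back so a human or safe path can handle the query."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "FALLBACK", probs[best]
    return id2label[best], probs[best]

id2label = {0: "CHECK_BALANCE", 1: "TRANSFER", 2: "FALLBACK"}
print(gate_intent([8.0, 0.5, 0.1], id2label))  # confident -> CHECK_BALANCE
print(gate_intent([1.0, 0.9, 0.8], id2label))  # flat distribution -> FALLBACK
```

For a banking assistant, a gate like this is what keeps a low-margin prediction from triggering an unintended financial action.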
 
+ ## 11. Inference Performance
 
+ * Evaluation throughput: ~950 samples/sec
+ * GPU inference optimized
+ * Suitable for real-time chatbot systems
 
  ---
 
+ ## 12. Version
 
+ Version: 6.0
+ Status: Production-ready with monitoring
+ Last Evaluated: Epoch 5
 
  ---
 
+ ## 13. Suggested Deployment Architecture
 
+ Recommended stack:
 
+ User Input
+ → Language detection (optional)
+ → Intent classifier (this model)
+ → Confidence threshold
+ → Business logic router
+ → Response generator
 
+ ---
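The stack above can be wired together as a thin routing layer. The classifier is stubbed here, and the handler and service names are illustrative assumptions rather than part of this repository:

```python
def route(text, classify, threshold=0.85):
    """Deployment flow: classify -> confidence gate -> business logic router."""
    intent, confidence = classify(text)  # in production: this model's prediction
    if confidence < threshold or intent == "FALLBACK":
        return "human_agent"             # safe escalation path
    handlers = {                         # hypothetical business-logic targets
        "CHECK_BALANCE": "balance_service",
        "TRANSFER": "transfer_service",
        "LOST_OR_STOLEN_CARD": "card_block_service",
    }
    return handlers.get(intent, "default_banking_faq")

# Stub classifier standing in for the fine-tuned model
def fake_classify(text):
    if "balance" in text:
        return ("CHECK_BALANCE", 0.99)
    return ("FALLBACK", 0.30)

print(route("what is my balance", fake_classify))  # balance_service
print(route("weather kemon", fake_classify))       # human_agent
```

A real deployment would put the response generator behind each handler; the router itself stays intent-agnostic apart from the mapping table.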
 
  ## License
  This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.