learn-abc committed on
Commit 01253aa · verified · 1 Parent(s): b9193e3

update README file based on improved version of model

Files changed (1): README.md (+159 −169)

README.md CHANGED
@@ -4,38 +4,91 @@ tags:
  - finance
  license: mit
  datasets:
- - PolyAI/banking77
  language:
  - en
  - bn
  base_model:
  - google/muril-base-cased
  ---
 
  # Banking Multilingual Intent Classifier
 
- - **Repository:** `learn-abc/banking-multilingual-intent-classifier`
- - **Base Model:** `google/muril-base-cased`
- - **Task:** Multilingual Intent Classification (Banking Domain)
- - **Languages:** English, Bangla (bn), Bangla Latin (bn-latn), Code-Mixed
 
  ---
 
- # Model Overview
 
- This model is a multilingual banking intent classifier fine-tuned on a balanced English–Bangla–Banglish dataset derived from Banking77 and extended with synthetic code-mixed augmentation.
 
- It is designed for:
 
- * AI banking assistants
- * Multilingual chatbots
  * Voice-to-intent pipelines
- * Intent routing systems
- * Hybrid Bangla-English financial applications
 
  ---
 
- # Supported Intents (14 Classes)
 
  ```
  ACCOUNT_INFO
@@ -56,206 +109,143 @@ TRANSFER
 
  ---
 
- # Dataset Details
 
- ### Total Samples
-
- 66,768
-
- ### Language Distribution
-
- * English (en): 22,256
- * Bangla (bn): 22,256
- * Bangla Latin (bn-latn): 22,256
 
- ### Code-Mixed Augmentation
 
- * 2,500 synthetic code-mixed examples added
 
  ---
 
- ### Final Training Split
 
- * Train: 63,306
- * Test: 13,854
 
- ---
 
- # Training Configuration
 
- * Base Model: `google/muril-base-cased`
- * Architecture: `BertForSequenceClassification`
- * Epochs: 7
- * Class weights applied to address imbalance
- * Tokenizer: MuRIL tokenizer
- * Framework: Hugging Face Transformers
 
- Note: Some classifier layers were newly initialized (expected when adapting base MuRIL to classification head).
 
  ---
 
- # Evaluation Results
-
- ## Overall Performance
-
- | Metric | Score |
- | --------- | ---------- |
- | Accuracy | **99.57%** |
- | F1 Micro | **0.9957** |
- | F1 Macro | **0.9959** |
- | Eval Loss | 0.0178 |
-
- - Evaluation runtime: 10.1 seconds
- - Samples/sec: 1365
 
  ---
 
- ## Language-wise Performance
 
- | Language | Accuracy |
- | ------------ | -------- |
- | English | 99.26% |
- | Bangla | 99.80% |
- | Bangla Latin | 99.62% |
- | Code-Mixed | 100.00% |
 
  ---
 
- # Multilingual Prediction Examples
-
- | Input | Language | Prediction |
- | ----------------------------- | ---------- | ------------------- |
- | what is my balance | en | CHECK_BALANCE |
- | আমার ব্যালেন্স কত | bn | CHECK_BALANCE |
- | amar balance koto ache | bn-latn | CHECK_BALANCE |
- | আমার balance দেখাও | code-mixed | CHECK_BALANCE |
- | card ta hariye geche | bn-latn | LOST_OR_STOLEN_CARD |
- | weather kemon | code-mixed | FALLBACK |
-
- All tested predictions returned high confidence (~1.000).
 
- ---
 
- # Intended Use Cases
 
- * Banking chatbot intent routing
- * Voice assistant → STT → Intent classification
- * Multilingual customer support
- * Code-mixed South Asian applications
- * Fintech AI pipelines
 
  ---
 
- # Limitations
-
- 1. Domain-specific: Focused only on banking intents.
- 2. Synthetic augmentation: Code-mixed data partially generated programmatically.
- 3. Overconfidence: Softmax confidence may saturate near 1.0.
- 4. Not tested on adversarial or out-of-distribution queries.
- 5. Not designed for generative responses, classification only.
 
- ---
 
- # Architecture Notes
 
- * Based on MuRIL, optimized for Indian languages.
- * Classification head added on top of encoder.
- * Some warnings regarding unexpected/missing keys are normal due to task adaptation.
- * Class weights applied to handle skewed distribution.
 
  ---
 
- # Bias & Fairness
 
- * Balanced across 3 language representations.
- * Augmented for code-mixed robustness.
- * May not generalize to:
-
-   * Non-banking domains
-   * Slang-heavy dialects outside training distribution
 
  ---
 
- # Example Usage
-
- ```python
- from transformers import AutoTokenizer, AutoModelForSequenceClassification
- import torch
-
- # Load model and tokenizer
- model_name = "learn-abc/banking-multilingual-intent-classifier"
- tokenizer = AutoTokenizer.from_pretrained(model_name)
- model = AutoModelForSequenceClassification.from_pretrained(model_name)
-
- device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
- model.to(device)
- model.eval()
-
- # Prediction function
- def predict_intent(text):
-     inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=64)
-     inputs = {k: v.to(device) for k, v in inputs.items()}
-
-     with torch.no_grad():
-         outputs = model(**inputs)
-         prediction = torch.argmax(outputs.logits, dim=-1).item()
-         confidence = torch.softmax(outputs.logits, dim=-1)[0][prediction].item()
-
-     predicted_intent = model.config.id2label[prediction]
-
-     return {
-         "intent": predicted_intent,
-         "confidence": confidence
-     }
-
- # Example usage - English
- result = predict_intent("what is my balance")
- print(f"Intent: {result['intent']}, Confidence: {result['confidence']:.2f}")
- # Output: Intent: CHECK_BALANCE, Confidence: 0.99
-
- # Example usage - Bangla
- result = predict_intent("আমার ব্যালেন্স কত")
- print(f"Intent: {result['intent']}, Confidence: {result['confidence']:.2f}")
- # Output: Intent: CHECK_BALANCE, Confidence: 0.98
-
- # Example usage - Banglish (Romanized)
- result = predict_intent("amar balance koto ache")
- print(f"Intent: {result['intent']}, Confidence: {result['confidence']:.2f}")
- # Output: Intent: CHECK_BALANCE, Confidence: 0.97
-
- # Example usage - Code-mixed
- result = predict_intent("আমার last 10 transaction দেখাও")
- print(f"Intent: {result['intent']}, Confidence: {result['confidence']:.2f}")
- # Output: Intent: MINI_STATEMENT, Confidence: 0.98
- ```
 
- ---
 
- # Production Recommendations
 
- For real-world deployment:
 
- * Add confidence threshold fallback
- * Add OOD detector
- * Combine with:
-
-   * STT system
-   * Intent router
-   * Business rule engine
- * Log misclassifications for continual fine-tuning
 
  ---
 
- # Summary
 
- This model achieves near state-of-the-art multilingual intent classification accuracy for banking-specific queries across:
 
- * English
- * Bangla (native script)
- * Bangla Latin
- * Code-mixed variants
 
- It is optimized for fintech AI systems targeting South Asian multilingual users.
 
  ## License
  This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
 
  - finance
  license: mit
  datasets:
+ - learn-abc/banking-intent-dataset
  language:
  - en
  - bn
  base_model:
  - google/muril-base-cased
+ metrics:
+ - accuracy
+ pipeline_tag: text-classification
+ ---
+
+
  ---
 
 
  # Banking Multilingual Intent Classifier
 
+ **Model Name:** Banking Multilingual Intent Classifier
+ **Base Model:** google/muril-base-cased
+ **Task:** Multilingual Intent Classification
+ **Intents:** 14
+ **Languages:** English, Bangla (Bengali script), Banglish (Romanized Bengali), Code-Mixed
 
  ---
 
+ ## 1. Model Overview
+
+ This model is a multilingual intent classifier designed for production-grade banking chatbot systems. It supports English, Bangla (Bengali script), and Banglish, including limited code-mixed input.
+
+ The model classifies user queries into 14 banking-specific intents with strong fallback detection for out-of-domain queries.
+
+ ---
 
+ ## 2. Intended Use
 
+ ### Primary Use Cases
 
+ * Banking virtual assistants
+ * Customer support chatbots
  * Voice-to-intent pipelines
+ * Multilingual conversational banking systems
+
+ ### Supported Capabilities
+
+ * Transaction queries
+ * Balance inquiries
+ * Card management
+ * Lost/stolen card reporting
+ * Fee clarification
+ * ATM issues
+ * Account updates
+ * General banking information
+ * Robust fallback detection for non-banking queries
 
  ---
 
+ ## 3. Dataset Summary
+
+ ### Total Samples
+
+ 110,364 original samples, plus 500 additional code-mixed samples added as training augmentation.
+
+ Final training size:
+
+ * Train: 99,273
+ * Test: 22,173
+
+ ### Language Distribution
+
+ | Language | Count |
+ | ---------- | ------ |
+ | English | 36,788 |
+ | Bangla | 36,788 |
+ | Banglish | 36,788 |
+ | Code-Mixed | ~0.45% of samples |
+
+ Balanced across the three main languages.
+
+ ---
+
+ ## 4. Intent Classes
+
+ Total Intents: 14
 
  ```
  ACCOUNT_INFO
 
 
  ---
 
+ ## 5. Data Characteristics
 
+ * Stratified 80/20 split
+ * Balanced language distribution
+ * Weighted loss for class imbalance
+ * Lowercase augmentation applied
+ * Hard negative examples included for:
+   * General knowledge
+   * Math queries
+   * Stock/crypto
+   * Biography queries
+   * Metaphorical financial language
+   * Government and legal topics
 
+ FALLBACK class strengthened for production safety.
 
  ---
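The stratified 80/20 split above can be sketched in plain Python. This is an illustrative sketch only; the actual split was presumably done with a library utility such as scikit-learn's `train_test_split(..., stratify=...)`, which the model card does not confirm.

```python
import random
from collections import defaultdict

def stratified_split(samples, labels, test_frac=0.2, seed=42):
    """Split (sample, label) pairs so that each label keeps the same
    train/test proportion -- i.e. a stratified 80/20 split."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_label[label].append(sample)

    train, test = [], []
    for label, group in by_label.items():
        rng.shuffle(group)
        n_test = round(len(group) * test_frac)
        test.extend((s, label) for s in group[:n_test])
        train.extend((s, label) for s in group[n_test:])
    return train, test

# Toy usage: 10 samples per intent -> 8/2 split per intent
texts = [f"query {i}" for i in range(20)]
intents = ["CHECK_BALANCE"] * 10 + ["FALLBACK"] * 10
train, test = stratified_split(texts, intents)
print(len(train), len(test))  # 16 4
```

Stratification matters here because the FALLBACK hard negatives must appear in both splits for the per-intent accuracies reported below to be meaningful.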
 
+ ## 6. Training Configuration
 
+ Base Model: MuRIL (multilingual BERT for Indic languages)
 
+ Hyperparameters:
 
+ * Epochs: 5
+ * Batch Size: 16 (with gradient accumulation = 2)
+ * Learning Rate: 5e-5
+ * Scheduler: Cosine
+ * Weight Decay: 0.01
+ * Early Stopping Enabled
+ * Weighted Cross-Entropy Loss
 
+ Max Sequence Length: 64
 
+ Hardware: GPU (CUDA)
 
  ---
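The weighted cross-entropy loss listed above needs per-class weights. A common choice, and an assumption here since the card does not state the exact formula, is inverse-frequency weighting:

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Per-class weights for weighted cross-entropy:
    weight(c) = total / (num_classes * count(c)),
    so rarer classes contribute more to the loss."""
    counts = Counter(labels)
    total = len(labels)
    n_classes = len(counts)
    return {c: total / (n_classes * n) for c, n in counts.items()}

# Toy usage: FALLBACK is 3x rarer than TRANSFER, so its weight is 3x larger
labels = ["TRANSFER"] * 75 + ["FALLBACK"] * 25
weights = inverse_frequency_weights(labels)
print(weights["FALLBACK"] / weights["TRANSFER"])  # 3.0
```

In a PyTorch training loop these weights would typically be ordered by label id and passed as a tensor to `torch.nn.CrossEntropyLoss(weight=...)`.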
 
+ ## 7. Evaluation Results
+
+ ### Overall Performance (Test Set: 22,173 samples)
+
+ * Accuracy: 98.36%
+ * F1 Micro: 98.36%
+ * F1 Macro: 98.21%
+
+ ### Accuracy by Intent
+
+ | Intent | Accuracy |
+ | --------------------- | -------- |
+ | ACCOUNT_INFO | 99.27% |
+ | ATM_SUPPORT | 99.08% |
+ | CARD_ISSUE | 99.15% |
+ | CARD_MANAGEMENT | 98.70% |
+ | CARD_REPLACEMENT | 99.55% |
+ | CHECK_BALANCE | 97.77% |
+ | EDIT_PERSONAL_DETAILS | 99.66% |
+ | FAILED_TRANSFER | 98.62% |
+ | FALLBACK | 97.04% |
+ | FEES | 99.58% |
+ | GREETING | 95.02% |
+ | LOST_OR_STOLEN_CARD | 98.43% |
+ | MINI_STATEMENT | 98.56% |
+ | TRANSFER | 99.25% |
 
  ---
 
+ ## 8. Strengths
 
+ * Strong multilingual generalization
+ * High performance on transactional intents
+ * Robust fallback detection for out-of-domain queries
+ * Resistant to keyword leakage
+ * Stable performance across English, Bangla, Banglish
+ * Class imbalance handled using weighted loss
+ * Production-safe fallback tuning
 
  ---
191
 
192
+ ## 9. Known Limitations
 
 
 
 
 
 
 
 
 
 
 
193
 
194
+ * Very short ambiguous inputs may drift between:
195
 
196
+ * GREETING
197
+ * FALLBACK
198
+ * Highly ambiguous informational queries may overlap between:
199
 
200
+ * MINI_STATEMENT
201
+ * ACCOUNT_INFO
202
+ * Code-mixed coverage is limited compared to core languages
203
+ * Model not optimized for long multi-turn conversational memory
 
204
 
205
  ---
206
 
207
+ ## 10. Safety & Risk Considerations
 
 
 
 
 
 
208
 
209
+ * Model prioritizes safe fallback over risky misclassification.
210
+ * Non-banking queries are correctly routed to FALLBACK.
211
+ * Reduces risk of executing unintended financial actions.
212
 
213
+ Recommended Production Safeguards:
214
 
215
+ * Confidence threshold filtering
216
+ * Human fallback escalation for low-confidence cases
217
+ * Logging for monitoring drift
 
218
 
219
  ---
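Confidence threshold filtering, the first safeguard above, can be sketched as a small gate over the classifier's softmax output. The threshold value and label ids below are illustrative assumptions, not values from the model card:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def gate_intent(logits, id2label, threshold=0.85):
    """Return the predicted intent only when the model is confident;
    otherwise fall back so a human or safe path can handle the query."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "FALLBACK", probs[best]
    return id2label[best], probs[best]

id2label = {0: "CHECK_BALANCE", 1: "TRANSFER", 2: "FALLBACK"}
print(gate_intent([8.0, 0.5, 0.1], id2label))  # confident -> CHECK_BALANCE
print(gate_intent([1.0, 0.9, 0.8], id2label))  # flat distribution -> FALLBACK
```

For a banking assistant, a gate like this is what keeps a low-margin prediction from triggering an unintended financial action.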
 
+ ## 11. Inference Performance
 
+ * Evaluation throughput: ~950 samples/sec
+ * GPU inference optimized
+ * Suitable for real-time chatbot systems
 
  ---
 
+ ## 12. Version
 
+ Version: 6.0
+ Status: Production-ready with monitoring
+ Last Evaluated: Epoch 5
 
  ---
 
+ ## 13. Suggested Deployment Architecture
 
+ Recommended stack:
 
+ User Input
+ → Language detection (optional)
+ → Intent classifier (this model)
+ → Confidence threshold
+ → Business logic router
+ → Response generator
 
+ ---
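The stack above can be wired together as a thin routing layer. The classifier is stubbed here, and the handler and service names are illustrative assumptions rather than part of this repository:

```python
def route(text, classify, threshold=0.85):
    """Deployment flow: classify -> confidence gate -> business logic router."""
    intent, confidence = classify(text)  # in production: this model's prediction
    if confidence < threshold or intent == "FALLBACK":
        return "human_agent"             # safe escalation path
    handlers = {                         # hypothetical business-logic targets
        "CHECK_BALANCE": "balance_service",
        "TRANSFER": "transfer_service",
        "LOST_OR_STOLEN_CARD": "card_block_service",
    }
    return handlers.get(intent, "default_banking_faq")

# Stub classifier standing in for the fine-tuned model
def fake_classify(text):
    if "balance" in text:
        return ("CHECK_BALANCE", 0.99)
    return ("FALLBACK", 0.30)

print(route("what is my balance", fake_classify))  # balance_service
print(route("weather kemon", fake_classify))       # human_agent
```

A real deployment would put the response generator behind each handler; the router itself stays intent-agnostic apart from the mapping table.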
 
  ## License
  This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.