---
library_name: transformers
tags:
- finance
license: apache-2.0
datasets:
- learn-abc/banking-intent-dataset
language:
- en
- bn
base_model:
- google/muril-base-cased
metrics:
- accuracy
pipeline_tag: text-classification
---

# Multilingual Banking Intent Classifier (EN + BN + Banglish)

## Overview

This model is a fine-tuned **MuRIL-based multilingual intent classifier** designed for production-grade banking chatbots.

- **Model Name:** Banking Multilingual Intent Classifier
- **Base Model:** google/muril-base-cased
- **Task:** Multilingual Intent Classification
- **Intents:** 14
- **Languages:** English, Bangla (Bengali script), Banglish (Romanized Bengali), Code-Mixed

The model performs 14-way intent classification for banking conversational systems.

---

## Base Model

`google/muril-base-cased`

MuRIL was selected for:

* Strong multilingual support
* Excellent performance on Indic languages
* Stable tokenization for Bangla + English
* Robust handling of code-mixed inputs

---

## Supported Intents (14)

```
ACCOUNT_INFO
ATM_SUPPORT
CARD_ISSUE
CARD_MANAGEMENT
CARD_REPLACEMENT
CHECK_BALANCE
EDIT_PERSONAL_DETAILS
FAILED_TRANSFER
FALLBACK
FEES
GREETING
LOST_OR_STOLEN_CARD
MINI_STATEMENT
TRANSFER
```

---

## Dataset Summary

### Total Samples: 100,971

### Languages (Balanced)

| Language           | Count  |
| ------------------ | ------ |
| English (en)       | 33,657 |
| Bangla (bn)        | 33,657 |
| Banglish (bn-latn) | 33,657 |

An additional 500 code-mixed examples are included.

---

## Final Training Dataset

| Split | Samples |
| ----- | ------- |
| Train | 91,051  |
| Test  | 20,295  |

### Class Distribution (Final Train)

- All intents are within a safe 4–10% range.
- FALLBACK is controlled at ~9.4%, preventing dominance.
- This distribution avoids class collapse and overconfidence bias.

---

## Evaluation Metrics

### Overall Performance

* Accuracy: **99.12%**
* F1 Micro: **99.12%**
* F1 Macro: **99.08%**
* Validation Loss: 0.046

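In single-label classification, micro F1 pools true positives, false positives, and false negatives across all classes, so it equals accuracy. This is why the Accuracy and F1 Micro figures above are identical. Macro F1 is the unweighted mean of the per-intent F1 scores, so a weak intent pulls it down regardless of how many samples it has. The sketch below recomputes all three on toy labels. `f1_scores` is a hypothetical helper written for illustration, and the labels are not taken from the model's evaluation set.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (accuracy, micro_f1, macro_f1) for single-label predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # label p was predicted but is wrong
            fn[t] += 1  # true label t was missed

    accuracy = sum(tp.values()) / len(y_true)

    # Micro F1: pool the counts over all classes, then compute a single F1.
    tp_all, fp_all, fn_all = sum(tp.values()), sum(fp.values()), sum(fn.values())
    micro_f1 = 2 * tp_all / (2 * tp_all + fp_all + fn_all)

    # Macro F1: compute F1 per class, then take the unweighted mean.
    per_class = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        per_class.append(2 * tp[label] / denom if denom else 0.0)
    macro_f1 = sum(per_class) / len(per_class)

    return accuracy, micro_f1, macro_f1


# Toy labels for illustration only.
y_true = ["GREETING", "FEES", "FEES", "TRANSFER", "TRANSFER", "TRANSFER"]
y_pred = ["GREETING", "FEES", "TRANSFER", "TRANSFER", "TRANSFER", "FALLBACK"]

accuracy, micro_f1, macro_f1 = f1_scores(y_true, y_pred)
print(f"accuracy={accuracy:.4f} micro_f1={micro_f1:.4f} macro_f1={macro_f1:.4f}")
```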
---

## Per-Intent Accuracy

| Intent                | Accuracy |
| --------------------- | -------- |
| ACCOUNT_INFO          | 99.14%   |
| ATM_SUPPORT           | 99.70%   |
| CARD_ISSUE            | 99.25%   |
| CARD_MANAGEMENT       | 99.43%   |
| CARD_REPLACEMENT      | 99.08%   |
| CHECK_BALANCE         | 99.05%   |
| EDIT_PERSONAL_DETAILS | 100.00%  |
| FAILED_TRANSFER       | 98.75%   |
| FALLBACK              | 97.86%   |
| FEES                  | 99.76%   |
| GREETING              | 97.41%   |
| LOST_OR_STOLEN_CARD   | 99.59%   |
| MINI_STATEMENT        | 98.80%   |
| TRANSFER              | 99.78%   |

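The card does not define how per-intent accuracy is computed. One common reading, assumed here, is the fraction of samples with a given true intent that the model labels correctly (per-class recall). The sketch below shows that calculation on toy labels. `per_intent_accuracy` is a hypothetical helper and the data is illustrative, not drawn from the test split.

```python
from collections import defaultdict


def per_intent_accuracy(y_true, y_pred):
    """Fraction of correctly predicted samples for each true intent."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            correct[t] += 1
    return {intent: correct[intent] / totals[intent] for intent in sorted(totals)}


# Toy labels for illustration only.
y_true = ["GREETING", "GREETING", "FEES", "FEES", "FEES", "FEES"]
y_pred = ["GREETING", "FALLBACK", "FEES", "FEES", "FEES", "TRANSFER"]

result = per_intent_accuracy(y_true, y_pred)
for intent, acc in result.items():
    print(f"{intent}: {acc:.2%}")
```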
---

## Strengths

* Strong multilingual support
* Balanced dataset distribution
* Robust fallback handling
* Stable across operational banking intents
* High macro F1 (99.08%) indicates that no minority intent has collapsed
* Performs well on code-mixed queries

---

## Intended Use

* Banking chatbot intent routing
* Customer support automation
* Financial conversational AI
* Multilingual banking assistants

---

## Out of Scope

* Fraud detection
* Sentiment analysis
* Financial advisory decisions
* Regulatory or legal compliance automation

---

## Production Recommendations

* Apply a confidence threshold to predictions
* Route low-confidence predictions to a human fallback
* Monitor softmax entropy to catch uncertain predictions
* Normalize numeric expressions before inference
* Log confusion pairs in production

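The first three recommendations can be combined into one routing step: accept a prediction only when its top probability is high and the softmax entropy is low, and hand everything else to a human. The sketch below is one minimal way to do this in plain Python. The `route` function, the `HUMAN_HANDOFF` label, and both threshold values are assumptions made for illustration and should be tuned on held-out data. For reference, the maximum entropy over 14 intents is ln(14) ≈ 2.64.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats; higher means a flatter, less certain distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def route(logits, labels, min_confidence=0.80, max_entropy=1.0):
    """Accept the top intent only if it passes both checks (thresholds are hypothetical)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    h = entropy(probs)
    accepted = probs[best] >= min_confidence and h <= max_entropy
    return {
        "intent": labels[best] if accepted else "HUMAN_HANDOFF",
        "confidence": probs[best],
        "entropy": h,
        "accepted": accepted,
    }


# Toy logits over three intents, for illustration only.
labels = ["CHECK_BALANCE", "TRANSFER", "FALLBACK"]
confident = route([6.0, 0.5, 0.2], labels)
uncertain = route([1.0, 0.9, 0.8], labels)
print(confident["intent"], round(confident["confidence"], 3))
print(uncertain["intent"], round(uncertain["entropy"], 3))
```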
---

## Example Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_name = "learn-abc/banking-multilingual-intent-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()

# Prediction function
def predict_intent(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=64)
    inputs = {k: v.to(device) for k, v in inputs.items()}

    with torch.no_grad():
        outputs = model(**inputs)
        prediction = torch.argmax(outputs.logits, dim=-1).item()
        confidence = torch.softmax(outputs.logits, dim=-1)[0][prediction].item()

    predicted_intent = model.config.id2label[prediction]

    return {
        "intent": predicted_intent,
        "confidence": confidence
    }

# Example usage - English
result = predict_intent("what is my balance")
print(f"Intent: {result['intent']}, Confidence: {result['confidence']:.2f}")
# Output: Intent: CHECK_BALANCE, Confidence: 0.99

# Example usage - Bangla
result = predict_intent("আমার ব্যালেন্স কত")
print(f"Intent: {result['intent']}, Confidence: {result['confidence']:.2f}")
# Output: Intent: CHECK_BALANCE, Confidence: 0.98

# Example usage - Banglish (Romanized)
result = predict_intent("amar balance koto ache")
print(f"Intent: {result['intent']}, Confidence: {result['confidence']:.2f}")
# Output: Intent: CHECK_BALANCE, Confidence: 0.97

# Example usage - Code-mixed
result = predict_intent("আমার last 10 transaction দেখাও")
print(f"Intent: {result['intent']}, Confidence: {result['confidence']:.2f}")
# Output: Intent: MINI_STATEMENT, Confidence: 0.98
```

---

## Limitations

* Does not handle multi-turn conversational context
* Extremely ambiguous short inputs may require thresholding
* Synthetic data may introduce stylistic bias
* No speech-to-text robustness included

---

## Version

- Version: 2.0
- Status: Production-Ready
- Architecture: MuRIL Base
- Language Coverage: EN + BN + Banglish

---

## License
This project is licensed under the Apache 2.0 License.

## Contact Me
For any inquiries or support, please reach out to:

* **Author:** [Abhishek Singh](https://github.com/SinghIsWriting/)
* **LinkedIn:** [My LinkedIn Profile](https://www.linkedin.com/in/abhishek-singh-bba2662a9)
* **Portfolio:** [Abhishek Singh Portfolio](https://me.devhome.me/)

---