f1rdavs/tajik-banking-intent-classifier

Text Classification

Generated from Trainer

text-embeddings-inference


f1rdavs committed on Jul 8, 2025

Commit ada6b5e · verified · 1 Parent(s): ce86d77

Update README.md

Files changed (1)

README.md +26 -13


@@ -7,6 +7,10 @@ tags:
 model-index:
 - name: tajik-classifier
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -14,21 +18,34 @@ should probably proofread and complete it, then remove this comment. -->
 # tajik-classifier
-This model is a fine-tuned version of [xlm-roberta-base](https://huggingface.co/xlm-roberta-base) on an unknown dataset.
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
-## Training procedure
 ### Training hyperparameters
@@ -42,13 +59,9 @@ The following hyperparameters were used during training:
 - num_epochs: 5
 - mixed_precision_training: Native AMP
-### Training results
 ### Framework versions
 - Transformers 4.52.4
 - Pytorch 2.6.0+cu124
 - Datasets 3.6.0
-- Tokenizers 0.21.1

 model-index:
 - name: tajik-classifier
   results: []
+datasets:
+- mteb/banking77
+language:
+- tg
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 # tajik-classifier
+This model is a fine-tuned version of [xlm-roberta-base](https://huggingface.co/xlm-roberta-base) trained on a Tajik-translated version of the [Banking77](https://huggingface.co/datasets/mteb/banking77) dataset.
+The dataset contains customer service queries related to banking, classified into 77 different intent categories.
+## 🧾 Model description
+* **Base model**: XLM-RoBERTa Base
+* **Language**: Tajik (tg)
+* **Task**: Text classification (intent recognition)
+* **Number of classes**: 77
+<p>The model classifies banking-related queries into one of 77 intent categories, such as card_payment, atm_support, balance, and lost_or_stolen_card. It is intended for customer support bots and virtual assistants that operate in the Tajik language.</p>
+## ✅ Intended uses
+* Banking customer support chatbots for Tajik-speaking users
+* Voice or text-based virtual assistants in the finance domain
+* Automated ticket or query routing in Tajik financial services
+## ⚠️ Limitations
+* The model may not generalize well to non-banking topics
+* Classification performance depends on the quality and accuracy of the dataset translation
+## 📚 Training and evaluation data
+* **Dataset**: Banking77 dataset translated from English to Tajik
+* **Size**: ~13,000 examples across 77 intent classes
+* **Source**: Original English Banking77 dataset, machine-translated into Tajik
+## ⚙️ Training procedure
 ### Training hyperparameters
 - num_epochs: 5
 - mixed_precision_training: Native AMP
 ### Framework versions
 - Transformers 4.52.4
 - Pytorch 2.6.0+cu124
 - Datasets 3.6.0
+- Tokenizers 0.21.1
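
The updated card describes a 77-class intent classifier intended for chatbots and query routing. A minimal sketch of how that could look in practice follows. It assumes the repo id shown in the page header (`f1rdavs/tajik-banking-intent-classifier`) can be loaded from the Hub through the `transformers` text-classification pipeline. The `route_query` helper, its `threshold`, and the `human_agent` fallback are hypothetical names introduced for illustration and are not part of the model card.

```python
from typing import Callable, Dict, List

# Repo id as shown in the page header; assumed to be loadable from the Hub.
MODEL_ID = "f1rdavs/tajik-banking-intent-classifier"

Prediction = Dict[str, object]


def load_classifier(model_id: str = MODEL_ID):
    """Build a text-classification pipeline (downloads the model on first use)."""
    # Imported lazily so the routing helper below works without transformers.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def route_query(
    text: str,
    classifier: Callable[[str], List[Prediction]],
    threshold: float = 0.5,      # hypothetical confidence cut-off
    fallback: str = "human_agent",  # hypothetical fallback route
) -> str:
    """Return the predicted intent, or `fallback` when confidence is low."""
    # For a single string, the pipeline returns [{"label": ..., "score": ...}].
    top = classifier(text)[0]
    if float(top["score"]) < threshold:
        return fallback
    return str(top["label"])
```

With the model downloaded, a call would look like `route_query(user_message, load_classifier())`. If the checkpoint's config does not define `id2label`, the returned labels may be generic names such as `LABEL_12` rather than Banking77 intent names, so a separate mapping would be needed.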