Abhimanyu345 committed · Commit daafad7 · verified · 1 parent: 2964108

Update README.md

  1. README.md +87 -26
README.md CHANGED
@@ -1,35 +1,64 @@
  ---
- language: en
  license: apache-2.0
  tags:
- - text-classification
- - distilbert
- - customer-support
- pipeline_tag: text-classification
- base_model: distilbert-base-uncased
  ---

  # Customer Support Ticket Classifier

- Fine-tuned `distilbert-base-uncased` for classifying customer support tickets
- into 5 categories.

  ## Labels

  | ID | Label |
  |----|-------|
- | 0 | Billing Inquiry |
- | 1 | Cancellation Request |
- | 2 | Product Inquiry |
- | 3 | Refund Request |
- | 4 | Technical Issue |

- ## Performance (Test Set, n=3750)

  | Metric | Value |
  |--------|-------|
- | Accuracy | 99.0% |
- | Macro F1 | 0.989 |

  ## Usage

@@ -41,19 +70,51 @@ classifier = pipeline(
      model="abhimanyu345/ticket-classifier"
  )

- classifier("I was charged twice this month")
- # [{'label': 'Billing Inquiry', 'score': 0.9996}]
  ```

  ## Training Details

- - **Dataset:** Twitter Customer Support Dataset (Kaggle), 25,000 examples balanced across 5 classes
- - **Training split:** 70/15/15 (train/val/test)
- - **Epochs:** 4 | **Batch size:** 32 | **LR:** 3e-5
- - **Platform:** Google Colab T4 GPU
- - **Experiment tracking:** [WandB Dashboard](https://api.wandb.ai/links/abhimanyu001-prom-iit-rajasthan/yttp7n7v)

- ## Links

- - GitHub: [ticket-classifier](https://github.com/abhimanyu345/ticket-classifier)
- - WandB: [Experiment Dashboard](https://api.wandb.ai/links/abhimanyu001-prom-iit-rajasthan/yttp7n7v)
  ---
+ language:
+ - en
  license: apache-2.0
  tags:
+ - text-classification
+ - customer-support
+ - distilbert
+ - transformers
+ - mlops
+ datasets:
+ - thoughtvector/customer-support-on-twitter
+ metrics:
+ - accuracy
+ - f1
+ model-index:
+ - name: ticket-classifier
+   results:
+   - task:
+       type: text-classification
+       name: Text Classification
+     dataset:
+       name: Customer Support on Twitter
+       type: thoughtvector/customer-support-on-twitter
+     metrics:
+     - type: accuracy
+       value: 0.99
+       name: Test Accuracy
+     - type: f1
+       value: 0.989
+       name: Macro F1
  ---

  # Customer Support Ticket Classifier

+ Fine-tuned **DistilBERT** model for classifying customer support tickets into 5 categories.
+
+ ## Model Description
+
+ This model is a fine-tuned version of `distilbert-base-uncased` trained on real customer support tweets from the [Customer Support on Twitter](https://www.kaggle.com/datasets/thoughtvector/customer-support-on-twitter) dataset.
+
+ Developed as part of the **MLDLOps Course Project** at IIT Rajasthan by Abhimanyu Gupta (B22BB001).

  ## Labels

  | ID | Label |
  |----|-------|
+ | 0 | Billing inquiry |
+ | 1 | Cancellation request |
+ | 2 | Product inquiry |
+ | 3 | Refund request |
+ | 4 | Technical issue |

+ ## Performance

  | Metric | Value |
  |--------|-------|
+ | Test Accuracy | **99.0%** |
+ | Macro F1 | **0.989** |
+ | Training Time | ~4.5 min (T4 GPU) |
+ | Inference Latency | ~60ms (CPU) |

  ## Usage

 
      model="abhimanyu345/ticket-classifier"
  )

+ result = classifier("I was charged twice for my subscription this month")
+ print(result)
+ # [{'label': 'Billing inquiry', 'score': 0.9996}]
  ```

  ## Training Details

+ - **Base model:** distilbert-base-uncased
+ - **Learning rate:** 3e-5
+ - **Batch size:** 32
+ - **Epochs:** 4
+ - **Max sequence length:** 128
+ - **Training platform:** Google Colab T4 GPU
+ - **Experiment tracking:** [WandB Project](https://api.wandb.ai/links/abhimanyu001-prom-iit-rajasthan/yttp7n7v)
+
+ ## Dataset
+
+ - **Source:** Twitter Customer Support dataset (2.8M tweets)
+ - **After filtering:** 658,787 labeled examples
+ - **After balancing:** 25,000 examples (5,000 per class)
+ - **Split:** 70% train / 15% val / 15% test
+
+ ## MLOps Pipeline
+
+ Full production pipeline including:
+
+ - **DVC** — data versioning
+ - **WandB** — experiment tracking
+ - **FastAPI** — model serving
+ - **Docker** — containerization
+ - **Prometheus** — metrics monitoring
+ - **Evidently AI** — drift detection
+ - **GitHub Actions** — CI/CD
+
+ **GitHub Repository:** https://github.com/abhimanyu345/ticket-classifier

+ ## Citation

+ ```bibtex
+ @misc{gupta2026ticketclassifier,
+   author = {Abhimanyu Gupta},
+   title = {Customer Support Ticket Classifier with MLOps Pipeline},
+   year = {2026},
+   publisher = {HuggingFace},
+   journal = {HuggingFace Model Hub},
+   howpublished = {\url{https://huggingface.co/abhimanyu345/ticket-classifier}}
+ }
+ ```
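
A note on the dataset figures in this diff: the new Dataset section lists 25,000 balanced examples (5,000 per class) split 70/15/15, while the removed Performance heading gave the test set as `n=3750`. The two appear consistent, since 15% of 25,000 is 3,750. The sketch below only checks that arithmetic. The helper name `split_sizes` is hypothetical, and the per-class figures assume a stratified split, which the README does not state.

```python
def split_sizes(n_total, percents=(70, 15, 15)):
    """Return (train, val, test) sizes for integer percentages.

    Val and test are floored; any remainder is assigned to train.
    Hypothetical helper, not taken from the repository.
    """
    if sum(percents) != 100:
        raise ValueError("percentages must sum to 100")
    val = n_total * percents[1] // 100
    test = n_total * percents[2] // 100
    train = n_total - val - test
    return train, val, test


# Whole balanced dataset, as listed in the README.
overall = split_sizes(25_000)

# One class of 5,000 examples, ASSUMING the split is stratified by label.
per_class = split_sizes(5_000)

print("overall   (train, val, test):", overall)
print("per class (train, val, test):", per_class)
```

Under the stratification assumption, each of the five classes would contribute the same number of test examples, which is when accuracy and macro F1 tend to land as close together as the reported 99.0% and 0.989.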