---
language:
- en
license: apache-2.0
tags:
- text-classification
- customer-support
- distilbert
- transformers
- mlops
datasets:
- thoughtvector/customer-support-on-twitter
metrics:
- accuracy
- f1
model-index:
- name: ticket-classifier
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: Customer Support on Twitter
      type: thoughtvector/customer-support-on-twitter
    metrics:
    - type: accuracy
      value: 0.99
      name: Test Accuracy
    - type: f1
      value: 0.989
      name: Macro F1
---
# Customer Support Ticket Classifier
A fine-tuned **DistilBERT** model that classifies customer support tickets into five categories.
## Model Description
This model is a fine-tuned version of `distilbert-base-uncased` trained on real customer support tweets from the [Customer Support on Twitter](https://www.kaggle.com/datasets/thoughtvector/customer-support-on-twitter) dataset.
Developed as part of the **MLDLOps Course Project** at IIT Rajasthan by Abhimanyu Gupta (B22BB001).
## Labels
| ID | Label |
|----|-------|
| 0 | Billing inquiry |
| 1 | Cancellation request |
| 2 | Product inquiry |
| 3 | Refund request |
| 4 | Technical issue |
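The table above can be used to decode raw model outputs by hand. The following is a minimal sketch, assuming the classification head emits one logit per class in ID order; the `example_logits` values are hypothetical and are not real model output.

```python
import math

# Label mapping copied from the table above.
ID2LABEL = {
    0: "Billing inquiry",
    1: "Cancellation request",
    2: "Product inquiry",
    3: "Refund request",
    4: "Technical issue",
}


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable form)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Hypothetical logits for a single ticket, for illustration only.
example_logits = [4.2, -1.0, 0.3, 1.1, -0.5]
label, prob = decode(example_logits)
print(label, round(prob, 3))
```

The `pipeline` helper shown under Usage performs this decoding internally, so the sketch is only needed when working with raw logits.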
## Performance
| Metric | Value |
|--------|-------|
| Test Accuracy | **99.0%** |
| Macro F1 | **0.989** |
| Training Time | ~4.5 min (T4 GPU) |
| Inference Latency | ~60 ms (CPU) |
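For readers unfamiliar with the metric, macro F1 is the unweighted mean of the per-class F1 scores, so each of the five classes counts equally. The sketch below illustrates the calculation on toy labels; it is not the evaluation script used for this model, and treating an undefined per-class F1 as 0 is an assumption.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def macro_f1(y_true, y_pred, num_classes=5):
    """Unweighted mean of per-class F1 scores."""
    f1_scores = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); set to 0 when the class never appears.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / num_classes


# Toy data: one "Technical issue" (4) ticket misread as "Billing inquiry" (0).
y_true = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4]
y_pred = [0, 0, 1, 1, 2, 2, 3, 3, 4, 0]
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
```

On a class-balanced test set such as the one described here, accuracy and macro F1 tend to be close, which is consistent with the reported 99.0% and 0.989.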
## Usage
```python
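from transformers import pipeline

# Load the fine-tuned model from the Hugging Face Hub.
classifier = pipeline(
    "text-classification",
    model="abhimanyu345/ticket-classifier",
)

# Classify a single ticket; the result is a list with one dict per input.
result = classifier("I was charged twice for my subscription this month")
print(result)
# [{'label': 'Billing inquiry', 'score': 0.9996}]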
```
## Training Details
- **Base model:** distilbert-base-uncased
- **Learning rate:** 3e-5
- **Batch size:** 32
- **Epochs:** 4
- **Max sequence length:** 128
- **Training platform:** Google Colab T4 GPU
- **Experiment tracking:** [WandB Project](https://api.wandb.ai/links/abhimanyu001-prom-iit-rajasthan/yttp7n7v)
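The hyperparameters above could be expressed as a configuration along the following lines. This is a sketch, not the author's training script: it assumes the Hugging Face `Trainer` API was used, that "batch size" means the per-device batch size on a single GPU, and the `output_dir` path is hypothetical.

```python
# Hyperparameters copied from the list above.
HPARAMS = {
    "base_model": "distilbert-base-uncased",
    "learning_rate": 3e-5,
    "batch_size": 32,
    "epochs": 4,
    "max_seq_length": 128,
}


def tokenizer_kwargs():
    """Arguments for the tokenizer call so inputs are cut at 128 tokens."""
    return {"truncation": True, "max_length": HPARAMS["max_seq_length"]}


def build_training_args(output_dir="./ticket-classifier-out"):
    """Map HPARAMS onto transformers.TrainingArguments.

    The import is local so the config can be inspected without
    transformers installed. output_dir is a hypothetical path.
    """
    from transformers import TrainingArguments

    return TrainingArguments(
        output_dir=output_dir,
        learning_rate=HPARAMS["learning_rate"],
        per_device_train_batch_size=HPARAMS["batch_size"],
        per_device_eval_batch_size=HPARAMS["batch_size"],
        num_train_epochs=HPARAMS["epochs"],
    )


print(tokenizer_kwargs())
```

Settings not listed in this card, such as warmup, weight decay, and the random seed, are left at library defaults in this sketch and may differ from the original run.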
## Dataset
- **Source:** Twitter Customer Support dataset (2.8M tweets)
- **After filtering:** 658,787 labeled examples
- **After balancing:** 25,000 examples (5,000 per class)
- **Split:** 70% train / 15% val / 15% test
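To make the split sizes concrete, the sketch below reproduces the arithmetic for a 70/15/15 split of the balanced set. It assumes a stratified split (equal class proportions in each partition), which this card does not confirm, and the step count assumes no gradient accumulation with the final partial batch kept.

```python
import math
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=0):
    """Return (train, val, test) index lists with class balance preserved."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(len(indices) * fractions[0])
        n_val = round(len(indices) * fractions[1])
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to test
    return train, val, test


# Stand-in labels: 5 classes x 5,000 examples, matching the balanced set.
labels = [c for c in range(5) for _ in range(5000)]
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))  # 17500 3750 3750

# Optimizer steps with the batch size (32) and epochs (4) from Training Details.
steps_per_epoch = math.ceil(len(train) / 32)
print(steps_per_epoch, steps_per_epoch * 4)  # 547 2188
```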
## MLOps Pipeline
Full production pipeline including:
- **DVC**: data versioning
- **WandB**: experiment tracking
- **FastAPI**: model serving
- **Docker**: containerization
- **Prometheus**: metrics monitoring
- **Evidently AI**: drift detection
- **GitHub Actions**: CI/CD
**GitHub Repository:** https://github.com/abhimanyu345/ticket-classifier
## Citation
```bibtex
@misc{gupta2026ticketclassifier,
author = {Abhimanyu Gupta},
title = {Customer Support Ticket Classifier with MLOps Pipeline},
year = {2026},
publisher = {HuggingFace},
journal = {HuggingFace Model Hub},
howpublished = {\url{https://huggingface.co/abhimanyu345/ticket-classifier}}
}
```