---
language:
- en
license: apache-2.0
tags:
- text-classification
- customer-support
- distilbert
- transformers
- mlops
datasets:
- thoughtvector/customer-support-on-twitter
metrics:
- accuracy
- f1
model-index:
- name: ticket-classifier
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: Customer Support on Twitter
      type: thoughtvector/customer-support-on-twitter
    metrics:
    - type: accuracy
      value: 0.99
      name: Test Accuracy
    - type: f1
      value: 0.989
      name: Macro F1
---

# Customer Support Ticket Classifier

Fine-tuned **DistilBERT** model for classifying customer support tickets into 5 categories.

## Model Description

This model is a fine-tuned version of `distilbert-base-uncased` trained on real customer support tweets from the [Customer Support on Twitter](https://www.kaggle.com/datasets/thoughtvector/customer-support-on-twitter) dataset.

Developed as part of the **MLDLOps Course Project** at IIT Rajasthan by Abhimanyu Gupta (B22BB001).

## Labels

| ID | Label |
|----|-------|
| 0 | Billing inquiry |
| 1 | Cancellation request |
| 2 | Product inquiry |
| 3 | Refund request |
| 4 | Technical issue |

## Performance

| Metric | Value |
|--------|-------|
| Test Accuracy | **99.0%** |
| Macro F1 | **0.989** |
| Training Time | ~4.5 min (T4 GPU) |
| Inference Latency | ~60 ms (CPU) |

## Usage

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="abhimanyu345/ticket-classifier"
)

result = classifier("I was charged twice for my subscription this month")
print(result)
# [{'label': 'Billing inquiry', 'score': 0.9996}]
```

## Training Details

- **Base model:** distilbert-base-uncased
- **Learning rate:** 3e-5
- **Batch size:** 32
- **Epochs:** 4
- **Max sequence length:** 128
- **Training platform:** Google Colab T4 GPU
- **Experiment tracking:** [WandB Project](https://api.wandb.ai/links/abhimanyu001-prom-iit-rajasthan/yttp7n7v)

## Dataset

- **Source:** Twitter Customer Support dataset (2.8M tweets)
- **After filtering:** 658,787 labeled examples
- **After balancing:** 25,000 examples (5,000 per class)
- **Split:** 70% train / 15% val / 15% test

## MLOps Pipeline

Full production pipeline including:

- **DVC:** data versioning
- **WandB:** experiment tracking
- **FastAPI:** model serving
- **Docker:** containerization
- **Prometheus:** metrics monitoring
- **Evidently AI:** drift detection
- **GitHub Actions:** CI/CD

**GitHub Repository:** https://github.com/abhimanyu345/ticket-classifier

## Citation

```bibtex
@misc{gupta2026ticketclassifier,
  author = {Abhimanyu Gupta},
  title = {Customer Support Ticket Classifier with MLOps Pipeline},
  year = {2026},
  publisher = {HuggingFace},
  journal = {HuggingFace Model Hub},
  howpublished = {\url{https://huggingface.co/abhimanyu345/ticket-classifier}}
}
```
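
## Example: Split Sizes for the Balanced Dataset

The Dataset section reports 25,000 balanced examples (5,000 per class) split 70% / 15% / 15%. The sketch below shows what those numbers imply per split, using a stratified split over synthetic label IDs. It is not the project's actual preprocessing code: the function name `stratified_split`, the seed, and the stand-in label list are assumptions made for illustration, and whether the original split was stratified is not stated in this card.

```python
import random
from collections import defaultdict

# Label IDs and names as listed in the Labels table.
ID2LABEL = {
    0: "Billing inquiry",
    1: "Cancellation request",
    2: "Product inquiry",
    3: "Refund request",
    4: "Technical issue",
}


def stratified_split(labels, train_frac=0.70, val_frac=0.15, seed=42):
    """Return (train, val, test) index lists, preserving class proportions.

    Hypothetical helper: the seed and per-class shuffling are assumptions.
    """
    # Group example indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train, val, test = [], [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_train = round(len(indices) * train_frac)
        n_val = round(len(indices) * val_frac)
        train.extend(indices[:n_train])
        val.extend(indices[n_train:n_train + n_val])
        # Whatever remains after train and val goes to test.
        test.extend(indices[n_train + n_val:])
    return train, val, test


# Synthetic stand-in for the balanced set: 5,000 examples per class.
labels = [label_id for label_id in ID2LABEL for _ in range(5000)]
train_idx, val_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(val_idx), len(test_idx))  # 17500 3750 3750
```

Under these assumptions each class contributes 3,500 training, 750 validation, and 750 test examples, so the reported test metrics would be computed over 3,750 examples.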