language:
  - en
license: apache-2.0
tags:
  - text-classification
  - customer-support
  - distilbert
  - transformers
  - mlops
datasets:
  - thoughtvector/customer-support-on-twitter
metrics:
  - accuracy
  - f1
model-index:
  - name: ticket-classifier
    results:
      - task:
          type: text-classification
          name: Text Classification
        dataset:
          name: Customer Support on Twitter
          type: thoughtvector/customer-support-on-twitter
        metrics:
          - type: accuracy
            value: 0.99
            name: Test Accuracy
          - type: f1
            value: 0.989
            name: Macro F1

Customer Support Ticket Classifier

A fine-tuned DistilBERT model that classifies customer support tickets into five categories.

Model Description

This model is a fine-tuned version of distilbert-base-uncased trained on real customer support tweets from the Customer Support on Twitter dataset.

Developed as part of the MLDLOps Course Project at IIT Rajasthan by Abhimanyu Gupta (B22BB001).

Labels

ID	Label
0	Billing inquiry
1	Cancellation request
2	Product inquiry
3	Refund request
4	Technical issue
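
When working with raw model outputs rather than the pipeline wrapper, the label table above is what maps a class index back to a name. The following is a minimal sketch of that decoding step, assuming the model emits one logit per class in the ID order listed; the helper names (`softmax`, `decode`) and the example logits are made up for illustration and are not taken from the repository.

```python
import math

# Label mapping copied from the table above.
ID2LABEL = {
    0: "Billing inquiry",
    1: "Cancellation request",
    2: "Product inquiry",
    3: "Refund request",
    4: "Technical issue",
}
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Hypothetical logits, invented for this example only.
label, prob = decode([4.0, 0.5, 0.2, 1.0, -1.0])
print(label, round(prob, 3))
```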

Performance

Metric	Value
Test Accuracy	99.0%
Macro F1	0.989
Training Time	~4.5 min (T4 GPU)
Inference Latency	~60 ms (CPU)
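
For readers unfamiliar with the two headline metrics, here is a small pure-Python sketch of how accuracy and macro F1 are defined. It is not the evaluation code used for this model (that is not shown in this card); it assumes integer class IDs 0 to 4 and assigns an F1 of 0 to any class with no true or predicted examples. In practice a library routine such as scikit-learn's `f1_score(..., average="macro")` would normally be used.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def macro_f1(y_true, y_pred, num_classes=5):
    """Unweighted mean of the per-class F1 scores."""
    f1_scores = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined here as 0 when the class is absent.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / num_classes


# Tiny made-up example with one mistake (a refund request predicted as billing).
y_true = [0, 1, 2, 3, 4, 3]
y_pred = [0, 1, 2, 3, 4, 0]
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred))
```

Because macro F1 averages classes with equal weight, it stays close to accuracy on a class-balanced test set like the one described in this card.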

Usage

from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="abhimanyu345/ticket-classifier"
)

result = classifier("I was charged twice for my subscription this month")
print(result)
# [{'label': 'Billing inquiry', 'score': 0.9996}]
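
A common next step is to act on the prediction only when the model is confident. The sketch below consumes the list-of-dicts format printed above; the function name `route_ticket`, the 0.8 threshold, and the "Manual review" fallback are all hypothetical choices for illustration, not part of this model or its repository.

```python
def route_ticket(predictions, threshold=0.8):
    """Pick a queue from pipeline-style output, deferring to a human when unsure.

    `predictions` is a list of {"label": str, "score": float} dicts,
    the same shape the text-classification pipeline returns.
    """
    top = max(predictions, key=lambda p: p["score"])
    if top["score"] >= threshold:
        return top["label"]
    return "Manual review"  # hypothetical fallback queue


# Output copied from the usage example above.
confident = [{"label": "Billing inquiry", "score": 0.9996}]
# Made-up low-confidence prediction.
uncertain = [{"label": "Refund request", "score": 0.41}]

print(route_ticket(confident))
print(route_ticket(uncertain))
```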

Training Details

Base model: distilbert-base-uncased
Learning rate: 3e-5
Batch size: 32
Epochs: 4
Max sequence length: 128
Training platform: Google Colab T4 GPU
Experiment tracking: Weights & Biases (WandB)
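
To give a sense of scale, the hyperparameters above imply roughly the following number of optimizer steps. This is back-of-the-envelope arithmetic, assuming the 70% training share of the 25,000 balanced examples described in the Dataset section, no gradient accumulation, a single device, and that the final partial batch is kept; none of those assumptions is confirmed by this card.

```python
import math

BATCH_SIZE = 32      # from Training Details
EPOCHS = 4           # from Training Details
BALANCED_TOTAL = 25_000  # from the Dataset section
TRAIN_PERCENT = 70       # from the Dataset section

# Integer arithmetic avoids floating-point rounding surprises.
train_examples = BALANCED_TOTAL * TRAIN_PERCENT // 100

# Assumption: the last, smaller batch is still used, so round up.
steps_per_epoch = math.ceil(train_examples / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS

print(train_examples, steps_per_epoch, total_steps)
```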

Dataset

Source: Customer Support on Twitter dataset (2.8M tweets)
After filtering: 658,787 labeled examples
After balancing: 25,000 examples (5,000 per class)
Split: 70% train / 15% val / 15% test
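
The balancing and splitting steps above can be sketched as follows. This is an illustrative reconstruction, not the project's actual preprocessing code: it assumes random downsampling to a fixed count per class and a split that is stratified by label, and the function name `balance_and_split` and the seed are hypothetical. With 5,000 examples per class, the same logic would yield 17,500 / 3,750 / 3,750 examples for train / val / test.

```python
import random
from collections import defaultdict


def balance_and_split(examples, per_class, seed=42, fractions=(0.70, 0.15, 0.15)):
    """Downsample each class to `per_class` items, then split per class.

    `examples` is a list of (text, label) pairs. Returns (train, val, test).
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))

    train, val, test = [], [], []
    for label in sorted(by_label):
        items = by_label[label]
        if len(items) < per_class:
            raise ValueError(f"class {label!r} has only {len(items)} examples")
        sample = rng.sample(items, per_class)  # random downsampling
        n_train = round(per_class * fractions[0])
        n_val = round(per_class * fractions[1])
        train += sample[:n_train]
        val += sample[n_train:n_train + n_val]
        test += sample[n_train + n_val:]  # remainder goes to test

    for part in (train, val, test):
        rng.shuffle(part)  # mix classes within each split
    return train, val, test


# Synthetic stand-in data: 5 classes with 30 made-up tickets each.
examples = [(f"ticket {label}-{i}", label) for label in range(5) for i in range(30)]
train, val, test = balance_and_split(examples, per_class=20)
print(len(train), len(val), len(test))
```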

MLOps Pipeline

The full production pipeline includes:

DVC — data versioning
WandB — experiment tracking
FastAPI — model serving
Docker — containerization
Prometheus — metrics monitoring
Evidently AI — drift detection
GitHub Actions — CI/CD

GitHub Repository: https://github.com/abhimanyu345/ticket-classifier
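
To illustrate the idea behind the drift-detection stage, here is a minimal sketch that compares the distribution of predicted labels between a reference window and a current window using the population stability index (PSI). This is a swapped-in, hand-rolled technique for illustration only: it does not use the Evidently AI API, it is not taken from the repository, and the function names and example windows are hypothetical.

```python
import math
from collections import Counter

LABELS = [
    "Billing inquiry",
    "Cancellation request",
    "Product inquiry",
    "Refund request",
    "Technical issue",
]


def label_distribution(labels, smoothing=1e-6):
    """Share of each label, lightly smoothed so no share is exactly zero."""
    counts = Counter(labels)
    total = len(labels) + smoothing * len(LABELS)
    return {name: (counts[name] + smoothing) / total for name in LABELS}


def population_stability_index(reference, current):
    """PSI = sum over labels of (cur - ref) * ln(cur / ref)."""
    ref = label_distribution(reference)
    cur = label_distribution(current)
    return sum((cur[k] - ref[k]) * math.log(cur[k] / ref[k]) for k in LABELS)


# Made-up windows: a balanced reference and a current window dominated by billing tickets.
reference = [name for name in LABELS for _ in range(20)]
current = ["Billing inquiry"] * 80 + [name for name in LABELS[1:] for _ in range(5)]

print(round(population_stability_index(reference, reference), 6))
print(round(population_stability_index(reference, current), 3))
```

A PSI near zero means the two windows look alike; larger values suggest the mix of incoming tickets has shifted and the model may need re-evaluation. Any alerting threshold would have to be chosen for the deployment at hand.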

Citation

@misc{gupta2026ticketclassifier,
  author = {Abhimanyu Gupta},
  title = {Customer Support Ticket Classifier with MLOps Pipeline},
  year = {2026},
  publisher = {Hugging Face},
  journal = {Hugging Face Model Hub},
  howpublished = {\url{https://huggingface.co/abhimanyu345/ticket-classifier}}
}