
	---
	language:
	- en
	license: apache-2.0
	tags:
	- text-classification
	- customer-support
	- distilbert
	- transformers
	- mlops
	datasets:
	- thoughtvector/customer-support-on-twitter
	metrics:
	- accuracy
	- f1
	model-index:
	- name: ticket-classifier
	  results:
	  - task:
	      type: text-classification
	      name: Text Classification
	    dataset:
	      name: Customer Support on Twitter
	      type: thoughtvector/customer-support-on-twitter
	    metrics:
	    - type: accuracy
	      value: 0.99
	      name: Test Accuracy
	    - type: f1
	      value: 0.989
	      name: Macro F1
	---

	# Customer Support Ticket Classifier

	A fine-tuned DistilBERT model that classifies customer support tickets into five categories.

	## Model Description

	This model is a fine-tuned version of `distilbert-base-uncased` trained on real customer support tweets from the [Customer Support on Twitter](https://www.kaggle.com/datasets/thoughtvector/customer-support-on-twitter) dataset.

	Developed as part of the MLDLOps Course Project at IIT Rajasthan by Abhimanyu Gupta (B22BB001).

	## Labels

	| ID | Label |
	|----|-------|
	| 0 | Billing inquiry |
	| 1 | Cancellation request |
	| 2 | Product inquiry |
	| 3 | Refund request |
	| 4 | Technical issue |
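
When working with raw model outputs rather than the pipeline, the label table can be expressed as a lookup. The following is a minimal sketch: `ID2LABEL`, `softmax`, and `decode_logits` are illustrative names that are not taken from the repository, and the logits in the usage lines are made up.

```python
import math

# Mapping copied from the label table; the variable name is illustrative.
ID2LABEL = {
    0: "Billing inquiry",
    1: "Cancellation request",
    2: "Product inquiry",
    3: "Refund request",
    4: "Technical issue",
}


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_logits(logits):
    """Return (label, probability) for the highest-scoring class."""
    if len(logits) != len(ID2LABEL):
        raise ValueError("expected one logit per class")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Made-up logits for illustration only.
label, prob = decode_logits([0.2, -1.0, 0.5, 4.0, 0.1])
print(label, round(prob, 3))
```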

	## Performance

	| Metric | Value |
	|--------|-------|
	| Test Accuracy | 99.0% |
	| Macro F1 | 0.989 |
	| Training Time | ~4.5 min (T4 GPU) |
	| Inference Latency | ~60 ms (CPU) |
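
Macro F1 is the unweighted mean of the per-class F1 scores, so each of the five classes counts equally regardless of how often it appears. A minimal pure-Python sketch of that calculation follows. The function name and the toy labels are illustrative; this is not the evaluation code or data behind the reported numbers.

```python
def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of per-class F1 scores."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    scores = []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); treated as 0 here if the class
        # appears in neither the true nor the predicted labels.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / num_classes


# Toy example with three classes (not real evaluation data).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 2]
print(macro_f1(y_true, y_pred, num_classes=3))
```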

	## Usage

	```python
	from transformers import pipeline

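	# Load the fine-tuned model from the Hugging Face Hub
	classifier = pipeline(
	    "text-classification",
	    model="abhimanyu345/ticket-classifier",
	)

	# Classify one ticket; the pipeline returns a list of {label, score} dicts
	result = classifier("I was charged twice for my subscription this month")
	print(result)
	# Example output: [{'label': 'Billing inquiry', 'score': 0.9996}]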
	```

	## Training Details

	- Base model: distilbert-base-uncased
	- Learning rate: 3e-5
	- Batch size: 32
	- Epochs: 4
	- Max sequence length: 128
	- Training platform: Google Colab T4 GPU
	- Experiment tracking: [WandB Project](https://api.wandb.ai/links/abhimanyu001-prom-iit-rajasthan/yttp7n7v)
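
As a rough sanity check on these settings, the number of optimizer steps can be estimated from the batch size and epoch count. The sketch below assumes the training split holds 70% of the 25,000 balanced examples (17,500, per the Dataset section), no gradient accumulation, and a single GPU. The `TrainConfig` dataclass and `total_steps` helper are illustrative names, not part of the project code.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    # Values copied from the training details; the class itself is illustrative.
    base_model: str = "distilbert-base-uncased"
    learning_rate: float = 3e-5
    batch_size: int = 32
    epochs: int = 4
    max_seq_length: int = 128


def total_steps(num_train_examples, cfg):
    """Optimizer steps, assuming the last partial batch of each epoch is kept."""
    steps_per_epoch = math.ceil(num_train_examples / cfg.batch_size)
    return steps_per_epoch * cfg.epochs


cfg = TrainConfig()
# Assumption: 70% of the 25,000 balanced examples form the training split.
num_train = 25_000 * 70 // 100
print(num_train, total_steps(num_train, cfg))
```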

	## Dataset

	- Source: Customer Support on Twitter dataset (2.8M tweets)
	- After filtering: 658,787 labeled examples
	- After balancing: 25,000 examples (5,000 per class)
	- Split: 70% train / 15% validation / 15% test
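
The 70/15/15 split of the balanced set can be sketched with the standard library alone. This is an illustration under stated assumptions, not the project's preprocessing code: it assumes the split is stratified by class, and `stratified_split`, the seed, and the synthetic labels are hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(labels, train=0.70, val=0.15, seed=42):
    """Split example indices per class into train/val/test index lists."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    splits = {"train": [], "val": [], "test": []}
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(len(indices) * train)
        n_val = round(len(indices) * val)
        splits["train"] += indices[:n_train]
        splits["val"] += indices[n_train:n_train + n_val]
        splits["test"] += indices[n_train + n_val:]  # remainder goes to test
    return splits


# Synthetic stand-in for the balanced set: 5 classes x 5,000 examples.
labels = [c for c in range(5) for _ in range(5_000)]
splits = stratified_split(labels)
print({name: len(idx) for name, idx in splits.items()})
```

With 5,000 examples per class, this yields 17,500 training, 3,750 validation, and 3,750 test examples, with every class equally represented in each split.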

	## MLOps Pipeline

	The full production pipeline includes:

	- DVC: data versioning
	- WandB: experiment tracking
	- FastAPI: model serving
	- Docker: containerization
	- Prometheus: metrics monitoring
	- Evidently AI: drift detection
	- GitHub Actions: CI/CD

	GitHub Repository: https://github.com/abhimanyu345/ticket-classifier

	## Citation

	```bibtex
	@misc{gupta2026ticketclassifier,
	  author = {Abhimanyu Gupta},
	  title = {Customer Support Ticket Classifier with MLOps Pipeline},
	  year = {2026},
	  publisher = {Hugging Face},
	  journal = {Hugging Face Model Hub},
	  howpublished = {\url{https://huggingface.co/abhimanyu345/ticket-classifier}}
	}
	```