DistilBERT — Customer-Support Ticket Routing
A fine-tuned distilbert-base-uncased model
that reads a customer-support ticket (subject + body) and predicts which of 10
departments should handle it. Built as a deep-learning term project.
Full project (code + paper + a CrewAI multi-agent demo): https://github.com/pxlnstn/distilbert-text-classification
Results (held-out test set)
| Metric | Score |
|---|---|
| Accuracy | 71.3% |
| Top-2 accuracy | 81.6% |
| Macro-F1 | 0.68 |
For reference, on these 10 imbalanced classes the majority-class baseline is ~29% and uniform random guessing is ~10%. Best configuration: 7 epochs, learning rate 3e-5, 256-token inputs.
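As a rough illustration of what the metrics above measure, the sketch below computes top-k accuracy, macro-F1, and the majority-class baseline in plain Python. The function names and the three-class toy data are hypothetical and chosen only for this example; this is not the project's evaluation code, and the numbers it produces are unrelated to the reported results.

```python
from collections import Counter


def top_k_accuracy(scores, labels, k):
    """Fraction of examples whose true class is among the k highest-scoring classes."""
    hits = 0
    for row, true in zip(scores, labels):
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        hits += true in ranked[:k]
    return hits / len(labels)


def macro_f1(preds, labels, num_classes):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    f1_scores = []
    for c in range(num_classes):
        tp = sum(p == c and t == c for p, t in zip(preds, labels))
        fp = sum(p == c and t != c for p, t in zip(preds, labels))
        fn = sum(p != c and t == c for p, t in zip(preds, labels))
        denom = 2 * tp + fp + fn
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / num_classes


def majority_baseline(labels):
    """Accuracy obtained by always predicting the most frequent class."""
    return Counter(labels).most_common(1)[0][1] / len(labels)


# Toy data (invented for illustration): 3 classes, 5 examples.
labels = [0, 0, 0, 1, 2]
scores = [
    [0.7, 0.2, 0.1],  # true class ranked first
    [0.4, 0.5, 0.1],  # true class ranked second
    [0.6, 0.3, 0.1],  # true class ranked first
    [0.3, 0.5, 0.2],  # true class ranked first
    [0.6, 0.3, 0.1],  # true class ranked third
]
preds = [max(range(len(row)), key=lambda c: row[c]) for row in scores]

print("top-1:", top_k_accuracy(scores, labels, 1))
print("top-2:", top_k_accuracy(scores, labels, 2))
print("macro-F1:", macro_f1(preds, labels, 3))
print("majority baseline:", majority_baseline(labels))
```

Macro-F1 averages per-class F1 without weighting by class size, which is why it can sit below accuracy when the classes are imbalanced.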
How to use
from transformers import pipeline

# top_k=None returns a score for every department, not just the best one
clf = pipeline("text-classification", model="pxlnstn/distilbert-ticket-routing",
               top_k=None)
print(clf("My internet has been down all morning and the VPN keeps disconnecting."))
# -> Technical Support ranks first, with IT Support close behind
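Because top-2 accuracy is noticeably higher than top-1, it can be useful to keep the two best departments rather than only the first. The sketch below is one way to pull them out of the pipeline result. The helper name `top_routes` is hypothetical, and `example_output` is a hand-written stand-in with invented scores and only four of the ten labels, so the snippet runs without downloading the model. Depending on the transformers version and on whether a string or a list is passed, the result may be nested one level deeper, so the helper unwraps that defensively.

```python
def top_routes(pipeline_output, k=2):
    """Return the k highest-scoring (label, score) pairs for a single ticket."""
    preds = pipeline_output
    # Unwrap any extra list nesting around the per-label dicts.
    while preds and isinstance(preds[0], list):
        preds = preds[0]
    ranked = sorted(preds, key=lambda p: p["score"], reverse=True)
    return [(p["label"], p["score"]) for p in ranked[:k]]


# Hypothetical pipeline result: scores are invented for illustration only.
example_output = [[
    {"label": "Technical Support", "score": 0.48},
    {"label": "IT Support", "score": 0.31},
    {"label": "Service Outages and Maintenance", "score": 0.12},
    {"label": "Customer Service", "score": 0.09},
]]

routes = top_routes(example_output)
print(routes)
```

With the real model, the same helper would be called as `top_routes(clf(ticket_text))`.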
Labels (departments)
Billing and Payments, Customer Service, General Inquiry, Human Resources,
IT Support, Product Support, Returns and Exchanges, Sales and Pre-Sales,
Service Outages and Maintenance, Technical Support.
Training
- Base model: distilbert-base-uncased (6 layers, ~66M parameters).
- Data: English tickets from Tobi-Bueck/customer-support-tickets (~28k rows), stratified 80/10/10 train/validation/test split.
- Input: ticket subject + body, truncated to 256 tokens.
- Optimiser: AdamW, FP16, batch size 16, 7 epochs, learning rate 3e-5.
- Trained on a single NVIDIA RTX 4070 Laptop GPU (8 GB).
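The stratified 80/10/10 split mentioned above can be sketched in plain Python as follows. This is a minimal illustration under assumptions, not the project's actual preprocessing: the function name `stratified_split`, the seed, and the toy label counts are all hypothetical. The idea is to split each class separately so that every subset keeps roughly the same class proportions as the full dataset.

```python
import random
from collections import Counter, defaultdict


def stratified_split(labels, fractions=(0.8, 0.1, 0.1), seed=0):
    """Split example indices into train/validation/test, class by class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)
        n_train = round(fractions[0] * len(idxs))
        n_val = round(fractions[1] * len(idxs))
        train += idxs[:n_train]
        val += idxs[n_train:n_train + n_val]
        test += idxs[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Toy, deliberately imbalanced labels (counts invented for illustration).
labels = (["Technical Support"] * 60
          + ["Billing and Payments"] * 30
          + ["Human Resources"] * 10)

train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))
print(Counter(labels[i] for i in test))
```

On imbalanced data such as this, a purely random split could leave a small department with very few validation or test examples, which makes per-class metrics like macro-F1 noisy.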
Limitations
The data is synthetic and several departments overlap (Technical Support / IT Support / Product Support describe similar tickets), which caps achievable accuracy. This overlap is why top-2 accuracy (81.6%) is much higher than top-1 accuracy (71.3%): the correct department is often the model's second choice. Intended for educational and demonstration use.
License
cc-by-nc-4.0, following the non-commercial licence of the training dataset.