Upload README.md with huggingface_hub

---
language: en
license: apache-2.0
tags:
- text-classification
- distilbert
- ticket-classification
- customer-support
- it-support
datasets:
- custom
metrics:
- accuracy
- f1
widget:
- text: "I can't log into my account, password reset not working"
- text: "System is very slow, taking forever to load pages"
- text: "How do I integrate with Salesforce API?"
---

# TicketCat: IT Support Ticket Classification Model

This is a fine-tuned DistilBERT model for classifying IT support tickets into 9 categories.

## Model Description

This model was fine-tuned on customer support tickets to automatically categorize incoming IT support requests. It uses DistilBERT as the base model and was trained to classify tickets into 9 distinct categories.

## Intended Use

This model is designed to automatically categorize IT support tickets to help route them to the appropriate support team or department.

## Categories

The model classifies tickets into the following categories:

- Account Access / Login Issues
- Billing & Payments
- Bug / Defect Reports
- Feature Requests
- General Inquiries / Other
- How-To / Product Usage Questions
- Integration Issues
- Performance Problems
- Security & Compliance
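
The Usage section maps class indices 0 through 8 to these categories in the order listed. As a small sketch, the same mapping can be kept as a pair of lookup tables, which is handy when encoding labels as well as decoding predictions. The names `CATEGORIES`, `ID2LABEL`, and `LABEL2ID` are illustrative and not part of the published model.

```python
# Category names in class-index order (0..8), matching the list above.
CATEGORIES = [
    "Account Access / Login Issues",
    "Billing & Payments",
    "Bug / Defect Reports",
    "Feature Requests",
    "General Inquiries / Other",
    "How-To / Product Usage Questions",
    "Integration Issues",
    "Performance Problems",
    "Security & Compliance",
]

# Index -> name, for decoding model predictions.
ID2LABEL = dict(enumerate(CATEGORIES))

# Name -> index, for encoding labelled tickets.
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}

print(ID2LABEL[0])
print(LABEL2ID["Security & Compliance"])
```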

## Usage

```python
from transformers import DistilBertForSequenceClassification, DistilBertTokenizer
import torch

# Load model and tokenizer
model_name = "YOUR_HF_USERNAME/ticketcat-distilbert"  # Replace with your actual model name
tokenizer = DistilBertTokenizer.from_pretrained(model_name)
model = DistilBertForSequenceClassification.from_pretrained(model_name)

# Classify a ticket
ticket_text = "I can't access my account, password reset link is not working"
inputs = tokenizer(ticket_text, return_tensors="pt", padding=True, truncation=True, max_length=128)

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=1)
    predicted_class = torch.argmax(probs, dim=1).item()
    confidence = probs.max().item()

# Map the predicted class index to its category name
categories = {
    0: 'Account Access / Login Issues',
    1: 'Billing & Payments',
    2: 'Bug / Defect Reports',
    3: 'Feature Requests',
    4: 'General Inquiries / Other',
    5: 'How-To / Product Usage Questions',
    6: 'Integration Issues',
    7: 'Performance Problems',
    8: 'Security & Compliance',
}
predicted_category = categories[predicted_class]

print(f"Category: {predicted_category}")
print(f"Confidence: {confidence:.4f}")
```

## Training Details

- **Base Model**: distilbert-base-uncased
- **Framework**: Hugging Face Transformers
- **Task**: Multi-class text classification
- **Number of Classes**: 9
- **Max Sequence Length**: 128 tokens
- **Training Approach**: Fine-tuning with class weights for imbalanced data
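
The exact class-weighting scheme is not documented in this card. As a rough sketch of the general idea, one common choice is inverse-frequency ("balanced") weights, where each class gets `n_samples / (n_classes * class_count)`, so rare classes contribute more to the loss. The function name `balanced_class_weights` and the toy labels below are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels, num_classes):
    """Return one weight per class: n_samples / (num_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    weights = []
    for class_id in range(num_classes):
        if counts[class_id] == 0:
            # A class with no examples has no defined inverse frequency.
            raise ValueError(f"class {class_id} has no training examples")
        weights.append(n_samples / (num_classes * counts[class_id]))
    return weights


# Toy, made-up label distribution: class 0 is common, class 2 is rare.
toy_labels = [0] * 6 + [1] * 3 + [2] * 1
weights = balanced_class_weights(toy_labels, num_classes=3)
print(weights)
```

Weights like these would typically be converted to a tensor and passed to a weighted cross-entropy loss such as `torch.nn.CrossEntropyLoss(weight=...)`.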

## Limitations

- The model was trained on IT/customer support tickets and may not perform well on other domains
- Performance may vary on tickets that don't fit clearly into one category
- Low-confidence predictions (confidence < 0.6) may need human review
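
The 0.6 threshold can be applied as a simple gate before a ticket is routed automatically. This is a minimal sketch, assuming `predicted_category` and `confidence` come from an inference step like the one in the Usage section. The function `route_ticket` and the `"human_review"` queue name are hypothetical.

```python
REVIEW_THRESHOLD = 0.6  # confidence below this is sent to a person


def route_ticket(predicted_category, confidence, threshold=REVIEW_THRESHOLD):
    """Return the queue for a ticket: the predicted category, or human review."""
    if confidence < threshold:
        return "human_review"
    return predicted_category


print(route_ticket("Billing & Payments", 0.92))
print(route_ticket("Integration Issues", 0.41))
```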

## Citation

If you use this model, please cite:

```
@misc{ticketcat2024,
  author = {TicketCat Team},
  title = {TicketCat: IT Support Ticket Classification},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/YOUR_USERNAME/ticketcat-distilbert}}
}
```