metadata
language: en
license: mit
tags:
  - text-classification
  - it-support
  - customer-service
  - distilbert
datasets:
  - Tobi-Bueck/customer-support-tickets
metrics:
  - accuracy
  - f1
model-index:
  - name: it-support-ticket-classifier
    results:
      - task:
          type: text-classification
          name: IT Support Ticket Classification
        metrics:
          - type: accuracy
            value: 0.89
            name: Accuracy

🎫 IT Support Ticket Classifier

Fine-tuned DistilBERT model for automatic classification of IT support tickets into predefined categories.

🎯 Model Description

This model assigns each IT support ticket to one of five categories so it can be routed to the appropriate support team. It is based on DistilBERT (distilbert-base-uncased), a distilled version of BERT that is smaller and faster at inference, which makes it a good fit for latency-sensitive production environments.

Key Features:

  • ⚡ Fast inference (DistilBERT architecture)
  • 🎯 89% accuracy on test set
  • 🔧 Ready for production deployment
  • 📦 Easy integration with Transformers pipeline

📊 Performance

| Metric    | Score |
|-----------|-------|
| Accuracy  | 89%   |
| F1-Score  | 0.88  |
| Precision | 0.87  |
| Recall    | 0.89  |
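
As a quick sanity check on the figures above, F1 is the harmonic mean of precision and recall. The sketch below assumes the reported precision and recall are aggregate values computed the same way as the F1 score; this card does not state the averaging method (macro, micro, or weighted), and with macro averaging the identity holds only approximately.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in this model card.
f1 = f1_from_precision_recall(0.87, 0.89)
print(f"{f1:.2f}")  # prints 0.88
```

The result agrees with the reported F1-Score, so the three numbers are at least mutually consistent.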

🚀 Quick Start

Installation

pip install transformers torch

Usage

from transformers import pipeline

# Load the classifier
classifier = pipeline(
    "text-classification",
    model="jeremiasdavison/it-support-ticket-classifier"
)

# Classify a ticket
ticket = "My laptop won't connect to the office WiFi network"
result = classifier(ticket)

print(result)
# Example output (the exact label string and score will vary):
# [{'label': 'NETWORK', 'score': 0.95}]

Get All Class Probabilities

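from transformers import pipeline

# top_k=None returns a score for every class
# (it replaces the deprecated return_all_scores=True)
classifier = pipeline(
    "text-classification",
    model="jeremiasdavison/it-support-ticket-classifier",
    top_k=None
)

ticket = "I forgot my password and can't log into the system"
# Pass a list so the output is one list of class scores per ticket
results = classifier([ticket])[0]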

for result in results:
    print(f"{result['label']}: {result['score']:.2%}")

🏷️ Categories

The model classifies tickets into the following categories:

  • Hardware - Physical device issues (laptop, printer, monitor, etc.)
  • Software - Application bugs, software errors, installation problems
  • Network - WiFi, VPN, connectivity issues
  • Account - Login problems, password resets, permissions
  • General - General inquiries, documentation requests
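
A minimal sketch of how these categories might drive routing. The queue names, the fallback queue, and the 0.7 confidence threshold are all hypothetical placeholders rather than part of the model, and the exact casing of the model's label strings is assumed, so labels are lower-cased before lookup.

```python
# Hypothetical mapping from category to support queue; the queue names are
# placeholders and are not defined by the model.
TEAM_BY_CATEGORY = {
    "hardware": "desktop-support",
    "software": "application-support",
    "network": "network-operations",
    "account": "identity-and-access",
    "general": "service-desk",
}

# Hypothetical queue for low-confidence or unrecognized predictions.
FALLBACK_QUEUE = "manual-triage"


def route_ticket(prediction: dict, min_score: float = 0.7) -> str:
    """Return the queue for one prediction dict like {'label': ..., 'score': ...}.

    Predictions below min_score, or with a label missing from the mapping,
    go to the fallback queue for a human to review.
    """
    label = str(prediction["label"]).lower()
    if prediction["score"] < min_score:
        return FALLBACK_QUEUE
    return TEAM_BY_CATEGORY.get(label, FALLBACK_QUEUE)


print(route_ticket({"label": "NETWORK", "score": 0.95}))  # network-operations
print(route_ticket({"label": "Account", "score": 0.41}))  # manual-triage
```

Each dict passed to `route_ticket` has the same shape as one element of the pipeline output in the Usage section.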

📚 Training Data

The model was fine-tuned on the Tobi-Bueck/customer-support-tickets dataset:

  • Total examples: ~61,800 tickets
  • Filtered for: English language only
  • Train/Val/Test split: 80/10/10
  • Features: Subject + Body combined as input text
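
The preprocessing described above could look roughly like the following. This is a sketch under assumptions, not the original training script: the newline separator between subject and body, the shuffling seed, and the function names are all hypothetical.

```python
import random


def build_input_text(subject, body):
    """Join subject and body into one input string, skipping empty parts.

    The newline separator is an assumption; the card only says the two
    fields are combined.
    """
    parts = [p.strip() for p in (subject, body) if p and p.strip()]
    return "\n".join(parts)


def split_80_10_10(examples, seed=42):
    """Shuffle deterministically, then split into train/val/test (80/10/10)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = n * 8 // 10  # integer arithmetic avoids float rounding
    n_val = n // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


text = build_input_text("VPN down", "Cannot connect since 9am.")
train, val, test = split_80_10_10(range(100))
print(len(train), len(val), len(test))  # prints 80 10 10
```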

πŸ› οΈ Training Details

Hyperparameters

  • Base model: distilbert-base-uncased
  • Learning rate: 2e-5
  • Batch size: 16
  • Epochs: 3
  • Optimizer: AdamW
  • Max sequence length: 128 tokens
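
To give a feel for the size of this training run, the sketch below estimates the number of optimizer steps implied by these settings. It assumes that ~61,800 is the dataset size after filtering, that 80% of it is used for training, and that training ran on a single device without gradient accumulation; none of these is confirmed by the card. The dictionary keys mirror the corresponding Hugging Face `TrainingArguments` parameter names, but the dictionary itself is only illustrative.

```python
import math

# Hyperparameters as listed in this card (illustrative container only).
HYPERPARAMS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,
    "num_train_epochs": 3,
}


def total_optimizer_steps(num_train_examples: int, batch_size: int, epochs: int) -> int:
    """Optimizer steps for a single-device run with no gradient accumulation."""
    steps_per_epoch = math.ceil(num_train_examples / batch_size)
    return steps_per_epoch * epochs


# Assumption: ~61,800 examples after filtering, 80% of them in the train split.
num_train = 61_800 * 8 // 10
steps = total_optimizer_steps(
    num_train,
    HYPERPARAMS["per_device_train_batch_size"],
    HYPERPARAMS["num_train_epochs"],
)
print(num_train, steps)  # prints 49440 9270
```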

Framework

  • Transformers 4.x
  • PyTorch
  • Trained on Google Colab (T4 GPU)

🌐 Live Demo

Try the model in action: Hugging Face Space (coming soon)

💡 Use Cases

  • Automated ticket routing - Direct tickets to the right support team
  • Priority triage - Use the predicted category as one input to priority rules (the model predicts a category, not urgency)
  • Analytics - Understand ticket distribution by category
  • Chatbot integration - Pre-classify user issues in conversational interfaces
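
For the analytics use case, a small sketch of how predictions might be aggregated into a category distribution. It assumes each prediction is a dict with a `label` key, as in the pipeline output shown in the Usage section; the sample predictions are made up for illustration.

```python
from collections import Counter


def category_distribution(predictions: list[dict]) -> dict[str, float]:
    """Share of tickets per predicted label, most common label first."""
    counts = Counter(p["label"] for p in predictions)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.most_common()}


# Made-up predictions, shaped like the pipeline's top-1 output per ticket.
predictions = [
    {"label": "Network", "score": 0.95},
    {"label": "Account", "score": 0.91},
    {"label": "Network", "score": 0.88},
    {"label": "Hardware", "score": 0.76},
]

for label, share in category_distribution(predictions).items():
    print(f"{label}: {share:.0%}")
```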

⚠️ Limitations

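  • Trained primarily on synthetic IT support data, so accuracy on real tickets may be lower than the reported test score
  • Trained on English-language tickets only; other languages are not supported
  • May require fine-tuning for domain-specific terminology
  • Predicts a single category per ticket, so tickets describing multiple issues may be misrouted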
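
One way to mitigate the multiple-issue limitation is to flag tickets where the top two class scores are close and send them for manual review. The sketch below assumes the input is the full list of class scores for one ticket, as produced by the "Get All Class Probabilities" example; the 0.2 margin and the function names are hypothetical and would need tuning on real data.

```python
def top_two_margin(scores: list[dict]) -> float:
    """Gap between the highest and second-highest class scores."""
    ranked = sorted((s["score"] for s in scores), reverse=True)
    if len(ranked) < 2:
        return 1.0  # a single class cannot be ambiguous
    return ranked[0] - ranked[1]


def needs_review(scores: list[dict], min_margin: float = 0.2) -> bool:
    """True when the two best classes are too close to call."""
    return top_two_margin(scores) < min_margin


# Made-up scores for a ticket that mentions both WiFi and a password problem.
scores = [
    {"label": "Network", "score": 0.48},
    {"label": "Account", "score": 0.41},
    {"label": "Hardware", "score": 0.06},
    {"label": "Software", "score": 0.03},
    {"label": "General", "score": 0.02},
]
print(needs_review(scores))  # prints True
```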

🤝 Contributing

Feedback and contributions are welcome, whether you encounter issues or have suggestions.

📄 License

MIT License - feel free to use this model in your projects.

πŸ™ Acknowledgments

📫 Contact

Jeremias Davison


This model was created as a portfolio project to demonstrate an end-to-end ML workflow, from dataset selection to model training and deployment.