language: en
license: mit
tags:
  - text-classification
  - it-support
  - customer-service
  - distilbert
datasets:
  - Tobi-Bueck/customer-support-tickets
metrics:
  - accuracy
  - f1
model-index:
  - name: it-support-ticket-classifier
    results:
      - task:
          type: text-classification
          name: IT Support Ticket Classification
        metrics:
          - type: accuracy
            value: 0.89
            name: Accuracy

🎫 IT Support Ticket Classifier

Fine-tuned DistilBERT model for automatic classification of IT support tickets into predefined categories.

🎯 Model Description

This model classifies IT support tickets into categories so they can be routed to the appropriate support team. It is based on DistilBERT (distilbert-base-uncased), a smaller and faster distilled version of BERT, which keeps inference cost low in production environments.

Key Features:

⚡ Fast inference (DistilBERT architecture)
🎯 89% accuracy on test set
🔧 Ready for production deployment
📦 Easy integration with Transformers pipeline

📊 Performance

Metric	Score
Accuracy	0.89
F1-Score	0.88
Precision	0.87
Recall	0.89
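
The card does not say how precision, recall, and F1 were averaged across classes. As a rough sketch, assuming support-weighted averaging (an assumption, not something stated here), the four numbers could be computed from test-set predictions as follows. The `weighted_metrics` helper and the toy labels are illustrative only.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall and F1 (assumed averaging)."""
    total = len(y_true)
    support = Counter(y_true)      # number of true examples per class
    predicted = Counter(y_pred)    # number of predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    precision = recall = f1 = 0.0
    for label, n in support.items():
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / n
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = n / total         # each class counts in proportion to its support
        precision += weight * p
        recall += weight * r
        f1 += weight * f

    accuracy = sum(correct.values()) / total
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy labels for illustration only; these are not the model's test data.
y_true = ["Network", "Network", "Account", "Hardware"]
y_pred = ["Network", "Account", "Account", "Hardware"]
metrics = weighted_metrics(y_true, y_pred)
```

Under support weighting, recall is always equal to accuracy, which is consistent with the matching accuracy and recall figures in the table.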

🚀 Quick Start

Installation

pip install transformers torch

Usage

from transformers import pipeline

# Load the classifier
classifier = pipeline(
    "text-classification",
    model="jeremiasdavison/it-support-ticket-classifier"
)

# Classify a ticket
ticket = "My laptop won't connect to the office WiFi network"
result = classifier(ticket)

print(result)
# Example output (actual label strings and scores may differ):
# [{'label': 'NETWORK', 'score': 0.95}]

Get All Class Probabilities

classifier = pipeline(
    "text-classification",
    model="jeremiasdavison/it-support-ticket-classifier",
    top_k=None,  # score every class (replaces the deprecated return_all_scores=True)
)

ticket = "I forgot my password and can't log into the system"
# Passing a list returns one list of class scores per ticket
results = classifier([ticket])[0]

for result in results:
    print(f"{result['label']}: {result['score']:.2%}")

🏷️ Categories

The model classifies tickets into the following categories:

Hardware - Physical device issues (laptop, printer, monitor, etc.)
Software - Application bugs, software errors, installation problems
Network - WiFi, VPN, connectivity issues
Account - Login problems, password resets, permissions
General - General inquiries, documentation requests

📚 Training Data

The model was fine-tuned on the Tobi-Bueck/customer-support-tickets dataset:

Total examples: ~61,800 tickets
Filtered for: English language only
Train/Val/Test split: 80/10/10
Features: Subject + Body combined as input text
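
A minimal sketch of the preprocessing described above: joining subject and body into one input string and making an 80/10/10 split. The field names `subject` and `body`, the newline separator, the fixed seed, and both helper names are assumptions for illustration; the card does not specify them.

```python
import random


def build_text(ticket):
    """Join subject and body into a single input string (separator is an assumption)."""
    subject = (ticket.get("subject") or "").strip()
    body = (ticket.get("body") or "").strip()
    return f"{subject}\n{body}".strip()


def split_80_10_10(items, seed=42):
    """Shuffle a copy of the items and cut it into train/validation/test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * 8 // 10
    n_val = len(shuffled) // 10
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder goes to the test split
    return train, val, test


# Hypothetical tickets standing in for the real dataset rows.
tickets = [{"subject": f"Issue {i}", "body": f"Details for issue {i}"} for i in range(100)]
texts = [build_text(t) for t in tickets]
train, val, test = split_80_10_10(texts)
```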

🛠️ Training Details

Hyperparameters

Base model: distilbert-base-uncased
Learning rate: 2e-5
Batch size: 16
Epochs: 3
Optimizer: AdamW
Max sequence length: 128 tokens
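
For orientation, the values above could be collected roughly as follows. The key names match keyword arguments of `transformers.TrainingArguments`, but the actual training script is not shown here, so treat this as a sketch. The step count additionally assumes that the ~61,800 tickets are the total after filtering, that 80% of them were used for training, and that training ran on a single GPU without gradient accumulation.

```python
import math

# Hyperparameters from the list above, keyed by TrainingArguments keyword names.
training_kwargs = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,
    "num_train_epochs": 3,
}
MAX_LENGTH = 128  # tokenizer truncation length, applied when tokenizing the text

# Assumption: ~61,800 tickets in total, 80% of which form the training split.
total_tickets = 61_800
train_examples = total_tickets * 8 // 10

batch_size = training_kwargs["per_device_train_batch_size"]
steps_per_epoch = math.ceil(train_examples / batch_size)
total_steps = steps_per_epoch * training_kwargs["num_train_epochs"]

print(f"{train_examples} examples, {steps_per_epoch} steps/epoch, {total_steps} steps total")
# 49440 examples, 3090 steps/epoch, 9270 steps total
```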

Framework

Transformers 4.x
PyTorch
Trained on Google Colab (T4 GPU)

🌐 Live Demo

Try the model in action: HuggingFace Space (coming soon)

💡 Use Cases

Automated ticket routing - Direct tickets to the right support team
Priority detection - Use the predicted category as one signal for spotting urgent issues (the model predicts categories, not priority levels)
Analytics - Understand ticket distribution by category
Chatbot integration - Pre-classify user issues in conversational interfaces
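
As an illustration of the routing use case, the sketch below maps a prediction to a support queue and falls back to manual triage when the score is low. The queue names, the 0.7 threshold, and the `route_ticket` helper are hypothetical. Label strings are compared case-insensitively because the exact label names emitted by the model are not documented here.

```python
# Hypothetical mapping from category to support queue.
QUEUES = {
    "hardware": "desktop-support",
    "software": "app-support",
    "network": "network-ops",
    "account": "identity-team",
    "general": "service-desk",
}


def route_ticket(prediction, threshold=0.7, fallback="manual-triage"):
    """Pick a queue from one pipeline result such as {'label': ..., 'score': ...}."""
    label = prediction["label"].lower()
    # Low-confidence or unknown labels go to a human instead of a team queue.
    if prediction["score"] < threshold or label not in QUEUES:
        return fallback
    return QUEUES[label]


# Example with a hand-written prediction in the pipeline's output format.
queue = route_ticket({"label": "NETWORK", "score": 0.95})
print(queue)  # network-ops
```

With the classifier from the Quick Start section, this could be called as `route_ticket(classifier(ticket)[0])`.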

⚠️ Limitations

Trained primarily on synthetic IT support data
Trained on English-language tickets only, so performance on other languages is likely to be lower
May require fine-tuning for domain-specific terminology
Performance may vary on tickets with multiple issues
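
One way to handle tickets that describe multiple issues is to look at the gap between the two highest class scores and flag the ticket for review when that gap is small. This is a sketch, not part of the model; the `is_ambiguous` helper, the 0.15 margin, and the example scores are assumptions.

```python
def is_ambiguous(scores, margin=0.15):
    """Return True when the two best class scores are within `margin` of each other."""
    ranked = sorted(scores, key=lambda s: s["score"], reverse=True)
    if len(ranked) < 2:
        return False
    return ranked[0]["score"] - ranked[1]["score"] < margin


# Hand-written scores in the all-classes output format, for illustration only.
scores = [
    {"label": "Network", "score": 0.48},
    {"label": "Account", "score": 0.41},
    {"label": "Hardware", "score": 0.11},
]
flagged = is_ambiguous(scores)  # True: the top two scores are about 0.07 apart
```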

🤝 Contributing

Feedback and contributions are welcome! If you encounter issues or have suggestions:

Open an issue on the model repository
Reach out via the community tab

📄 License

MIT License - feel free to use this model in your projects.

🙏 Acknowledgments

Dataset: Tobi-Bueck/customer-support-tickets
Base model: DistilBERT by Hugging Face
Built with Transformers

📫 Contact

Jeremias Davison

HuggingFace: @jeremiasdavison
LinkedIn: linkedin.com/in/jeremiasdavison

This model was created as a portfolio project to demonstrate an end-to-end ML workflow, from dataset selection through model training to deployment.