bitext/Bitext-customer-support-llm-chatbot-training-dataset
Fine-tuned distilbert/distilbert-base-uncased for classifying customer support tickets into 27 intent categories.
This model classifies raw customer support ticket text into one of 27 issue-type intents, enabling automated routing, prioritisation, and analytics for customer support pipelines.
| ID | Intent | ID | Intent |
|---|---|---|---|
| 0 | cancel_order | 14 | edit_account |
| 1 | change_order | 15 | get_invoice |
| 2 | change_shipping_address | 16 | get_refund |
| 3 | check_cancellation_fee | 17 | newsletter_subscription |
| 4 | check_invoice | 18 | payment_issue |
| 5 | check_payment_methods | 19 | place_order |
| 6 | check_refund_policy | 20 | recover_password |
| 7 | complaint | 21 | registration_problems |
| 8 | contact_customer_service | 22 | review |
| 9 | contact_human_agent | 23 | set_up_shipping_address |
| 10 | create_account | 24 | switch_account |
| 11 | delete_account | 25 | track_order |
| 12 | delivery_options | 26 | track_refund |
| 13 | delivery_period | | |
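The table can be turned into the two lookup dictionaries that `transformers` model configs conventionally carry (`id2label` and `label2id`). The sketch below is built only from the table; whether the published config stores exactly these names in this order is an assumption until the trained weights are uploaded.

```python
# Intent names in ID order, copied from the label table.
INTENTS = [
    "cancel_order", "change_order", "change_shipping_address",
    "check_cancellation_fee", "check_invoice", "check_payment_methods",
    "check_refund_policy", "complaint", "contact_customer_service",
    "contact_human_agent", "create_account", "delete_account",
    "delivery_options", "delivery_period", "edit_account", "get_invoice",
    "get_refund", "newsletter_subscription", "payment_issue", "place_order",
    "recover_password", "registration_problems", "review",
    "set_up_shipping_address", "switch_account", "track_order", "track_refund",
]

# ID -> name and name -> ID lookups.
id2label = dict(enumerate(INTENTS))
label2id = {name: idx for idx, name in id2label.items()}

print(len(id2label), id2label[13], label2id["track_refund"])
```

Keeping the list in one place and deriving both dictionaries from it avoids the two mappings drifting apart.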
| Hyperparameter | Value |
|---|---|
| Base model | distilbert/distilbert-base-uncased |
| Epochs | 3 |
| Batch size (per device) | 32 |
| Learning rate | 2e-5 |
| Weight decay | 0.01 |
| Warmup ratio | 0.1 |
| Max sequence length | 128 tokens |
| Best model selected by | Macro F1 |
| Optimizer | AdamW |
| Precision | fp16 |
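As a rough illustration of how these settings fit together, the sketch below collects them under the field names used by `transformers.TrainingArguments` and works out how many optimizer steps the warmup ratio corresponds to. Several things here are assumptions rather than facts from this card: the `macro_f1` metric key, the use of `load_best_model_at_end`, a single device with no gradient accumulation, and the training-set size, which is a made-up figure.

```python
import math

# Values copied from the hyperparameter table; keys follow the field names
# of transformers.TrainingArguments. AdamW is that class's default optimizer,
# and the 128-token limit is applied at tokenization time, so neither appears here.
training_kwargs = {
    "num_train_epochs": 3,
    "per_device_train_batch_size": 32,
    "learning_rate": 2e-5,
    "weight_decay": 0.01,
    "warmup_ratio": 0.1,
    "fp16": True,
    "load_best_model_at_end": True,        # ASSUMPTION: implied by "best model selected by"
    "metric_for_best_model": "macro_f1",   # ASSUMPTION: key returned by a compute_metrics function
}


def schedule_steps(num_train_samples, kwargs, num_devices=1):
    """Return (total optimizer steps, warmup steps), assuming no gradient accumulation."""
    samples_per_step = kwargs["per_device_train_batch_size"] * num_devices
    steps_per_epoch = math.ceil(num_train_samples / samples_per_step)
    total = steps_per_epoch * kwargs["num_train_epochs"]
    warmup = math.ceil(total * kwargs["warmup_ratio"])
    return total, warmup


# ASSUMPTION: 18,000 training samples is a hypothetical figure for illustration.
total_steps, warmup_steps = schedule_steps(18_000, training_kwargs)
print(total_steps, warmup_steps)
```

On a machine with a GPU, the dictionary could be passed on as `TrainingArguments(output_dir=..., **training_kwargs)`; the output directory and the evaluation and save strategies would still need to be chosen.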
⏳ Model training pending. Accuracy and Macro F1 will be filled in after training completes.
Evaluation uses a held-out test set (15% of the data, ~4,031 samples):
| Metric | Value |
|---|---|
| Accuracy | (pending) |
| Macro F1 | (pending) |
See confusion_matrix.png for the full per-class breakdown (to be added after training).
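For readers unfamiliar with the two metrics, here is a minimal pure-Python sketch of how accuracy and macro F1 are defined. Macro F1 is the unweighted mean of the per-class F1 scores, so rare intents count as much as common ones. The labels below are toy data and not results from this model; treating a class with no true or predicted examples as F1 = 0 is a convention chosen for this sketch.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels for illustration only.
y_true = ["cancel_order", "cancel_order", "get_refund", "track_order"]
y_pred = ["cancel_order", "get_refund", "get_refund", "track_order"]
labels = ["cancel_order", "get_refund", "track_order"]

print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred, labels))
```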
```python
from transformers import pipeline

# Load the fine-tuned classifier from the Hub
classifier = pipeline(
    "text-classification",
    model="annebanne/distilbert-support-classifier",
)

# Single ticket
result = classifier("I need to cancel my order, it hasn't shipped yet.")
print(result)
# Illustrative output shape (actual scores will vary):
# [{'label': 'cancel_order', 'score': 0.98}]

# Batch of tickets
tickets = [
    "Where is my refund? It's been 2 weeks.",
    "I can't log into my account after resetting my password.",
    "Please send me an invoice for order #12345.",
    "My payment keeps getting declined.",
]
results = classifier(tickets)

# Print the predicted intent, confidence, and a ticket preview
for ticket, res in zip(tickets, results):
    print(f"{res['label']:35s} ({res['score']:.2%}) - {ticket[:60]}")
```
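One way the predictions could feed automated routing is to map each intent to a support queue and send low-confidence tickets to a person. The sketch below is only an illustration: the queue names, the intent-to-queue mapping, and the 0.80 threshold are all hypothetical, and the predictions are hand-written in the pipeline's output shape so the example runs without downloading the model.

```python
# ASSUMPTION: queue names and this mapping are hypothetical examples.
QUEUE_BY_INTENT = {
    "cancel_order": "orders",
    "track_refund": "billing",
    "get_invoice": "billing",
    "payment_issue": "billing",
    "recover_password": "accounts",
}
FALLBACK_QUEUE = "human_triage"


def route(prediction, threshold=0.80):
    """Map one pipeline result ({'label': ..., 'score': ...}) to a queue name."""
    # Low-confidence predictions go to a human instead of an automated queue.
    if prediction["score"] < threshold:
        return FALLBACK_QUEUE
    # Intents without a configured queue also fall back to a human.
    return QUEUE_BY_INTENT.get(prediction["label"], FALLBACK_QUEUE)


# Hand-written predictions, not real model output.
sample = [
    {"label": "track_refund", "score": 0.97},
    {"label": "payment_issue", "score": 0.55},
    {"label": "review", "score": 0.91},
]
print([route(p) for p in sample])
```

A suitable threshold would have to be tuned on validation data once the model's metrics are available.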
If you use this model, please cite the base model:
```bibtex
@article{sanh2019distilbert,
  title={DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter},
  author={Sanh, Victor and Debut, Lysandre and Chaumond, Julien and Wolf, Thomas},
  journal={arXiv preprint arXiv:1910.01108},
  year={2019}
}
```
This model repository was generated by ML Intern, an agent for machine learning research and development on the Hugging Face Hub.