MagicSupport Intent Classifier (BERT Fine-Tuned)

Overview

This model is bert-base-uncased fine-tuned for multi-class intent classification in customer support environments.

It is optimized for:

  • Fast inference
  • High accuracy
  • Low deployment cost
  • Production-ready intent routing for support systems

The model is designed for the MagicSupport platform but is generalizable to structured customer support intent detection tasks.


Model Details

  • Base Model: bert-base-uncased
  • Architecture: BertForSequenceClassification
  • Task: Multi-class intent classification
  • Number of Intents: 28
  • Parameters: ~110M (F32)
  • Training Dataset: bitext/Bitext-customer-support-llm-chatbot-training-dataset
  • Loss: Cross-entropy with class weights
  • Framework: Hugging Face Transformers (PyTorch)
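The class-weighted cross-entropy loss can be sketched in plain PyTorch. The class counts below are illustrative (a few intents, not the real Bitext label statistics); the idea is that inverse-frequency weights make rare intents contribute more to the loss:

```python
import torch
import torch.nn as nn

# Hypothetical label distribution: counts per intent class in the training
# split (illustrative numbers covering 4 of the 28 intents, for brevity).
class_counts = torch.tensor([900.0, 850.0, 120.0, 60.0])

# Inverse-frequency weights: rarer classes receive larger weights.
weights = class_counts.sum() / (len(class_counts) * class_counts)

# CrossEntropyLoss accepts a per-class weight tensor directly.
loss_fn = nn.CrossEntropyLoss(weight=weights)

# Dummy logits and labels just to show the loss call.
logits = torch.randn(8, 4)
labels = torch.randint(0, 4, (8,))
loss = loss_fn(logits, labels)
```

Other weighting schemes (e.g. square-root inverse frequency) are possible; the model card only states that class weights were used, not the exact formula.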

Performance

Validation Metrics (Epoch 5)

  • Accuracy: 0.9983
  • F1 Micro: 0.9983
  • F1 Macro: 0.9983
  • Validation Loss: 0.0087

The model converges stably over 5 epochs, with near-saturated metrics on the Bitext validation split.
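For reference, the reported metrics can be recomputed with scikit-learn (an assumption about tooling; the tiny label arrays below are illustrative). Note that in single-label multi-class classification, micro-averaged F1 equals accuracy:

```python
from sklearn.metrics import accuracy_score, f1_score

# Illustrative true and predicted intent IDs for 6 validation examples.
y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 1, 2, 1, 1, 0]

acc = accuracy_score(y_true, y_pred)
f1_micro = f1_score(y_true, y_pred, average="micro")  # equals accuracy here
f1_macro = f1_score(y_true, y_pred, average="macro")  # unweighted mean over classes
```

Macro-F1 averages per-class F1 without frequency weighting, so it is the more informative number when the 28 intents are imbalanced.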


Example Predictions

  Query                                    Predicted Intent    Confidence
  I want to cancel my order                cancel_order        0.999
  How do I track my shipment               delivery_options    0.997
  I need a refund for my purchase          get_refund          0.999
  I forgot my password                     recover_password    0.999
  I have a complaint about your service    complaint           0.996
  hello                                    FALLBACK            0.999

The model also correctly identifies low-information inputs and maps them to a fallback intent.


Intended Use

This model is intended for:

  • Customer support intent classification
  • Chatbot routing
  • Support ticket categorization
  • Voice-to-intent pipelines (after STT)
  • Pre-routing before LLM or RAG systems

Typical production flow:

User Query → BERT Intent Classifier → Route to:

  • Knowledge Base Retrieval
  • Ticketing System
  • Escalation to Human
  • Fallback LLM
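The routing step above can be sketched as a small dispatch function over the classifier's output dict. The intent-to-destination mapping below is hypothetical (it is not part of the model), and the 0.75 threshold mirrors the default in the example usage:

```python
def route(prediction: dict) -> str:
    """Map a classifier prediction to a downstream handler.

    `prediction` is the {"intent": ..., "confidence": ...} dict returned
    by predict_intent(). The routing table here is illustrative only.
    """
    intent, confidence = prediction["intent"], prediction["confidence"]

    # Low-confidence or explicit fallback predictions go to the fallback LLM.
    if intent == "FALLBACK" or confidence < 0.75:
        return "fallback_llm"
    # Sensitive intents are escalated to a human agent.
    if intent in {"complaint", "cancel_order"}:
        return "human_escalation"
    # Transactional intents open or update a ticket.
    if intent in {"get_refund", "delivery_options"}:
        return "ticketing_system"
    # Everything else is answered from the knowledge base.
    return "knowledge_base"

route({"intent": "complaint", "confidence": 0.96})         # → "human_escalation"
route({"intent": "hello", "confidence": 0.40})             # → "fallback_llm"
route({"intent": "recover_password", "confidence": 0.99})  # → "knowledge_base"
```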

Example Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer from HuggingFace Hub
model_name = "learn-abc/magicSupport-intent-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Set device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()

# Prediction function
def predict_intent(text, confidence_threshold=0.75):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=64)
    inputs = {k: v.to(device) for k, v in inputs.items()}
    
    with torch.no_grad():
        outputs = model(**inputs)
        logits = outputs.logits
        probs = torch.softmax(logits, dim=-1)
        confidence, prediction = torch.max(probs, dim=-1)
    
    predicted_intent = model.config.id2label[prediction.item()]
    confidence_score = confidence.item()
    
    # Apply confidence threshold
    if confidence_score < confidence_threshold:
        predicted_intent = "FALLBACK"
    
    return {
        "intent": predicted_intent,
        "confidence": confidence_score
    }

# Example usage
queries = [
    "I want to cancel my order",
    "How do I track my package",
    "I need a refund",
    "hello there"
]

for query in queries:
    result = predict_intent(query)
    print(f"Query: {query}")
    print(f"Intent: {result['intent']}")
    print(f"Confidence: {result['confidence']:.3f}\n")

Design Decisions

  • BERT selected over larger LLMs for:

    • Low latency
    • Cost efficiency
    • Predictable inference
    • Edge deployability
  • Class weighting applied to mitigate dataset imbalance.

  • High-confidence outputs indicate strong separation between intent classes.


Known Limitations

  • Designed for structured customer support queries.

  • May struggle with:

    • Highly conversational multi-turn context
    • Extremely domain-specific enterprise terminology
    • Heavy slang or multilingual input
  • Not trained for open-domain conversation.


Future Improvements

  • Add MagicSupport real production data for domain adaptation.
  • Add hierarchical intent structure.
  • Introduce confidence threshold calibration.
  • Add OOD (Out-of-Distribution) detection.
  • Provide a quantized inference version for edge deployment.
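Dynamic quantization is one low-effort path to the planned edge-deployment variant. The stand-in module below keeps the sketch runnable without downloading weights; the same `quantize_dynamic` call applies to a loaded BertForSequenceClassification instance:

```python
import torch
import torch.nn as nn

# Illustrative stand-in for the classifier head (768-d inputs, 28 intents);
# substitute the loaded BERT model in practice.
model = nn.Sequential(nn.Linear(768, 256), nn.ReLU(), nn.Linear(256, 28))
model.eval()

# Dynamic quantization converts Linear weights to int8 ahead of time;
# activations are quantized on the fly at inference.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

with torch.no_grad():
    out = quantized(torch.randn(1, 768))
```

For BERT this typically shrinks the Linear layers roughly 4x with a modest accuracy cost; the trade-off should be re-measured on the validation split before release.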

License

Specify your intended license here (e.g., MIT, Apache-2.0).


Citation

If using this model in research or production, please cite appropriately.


Model Card Author

For any inquiries or support, please reach out to:
