mmBERT Feedback Detector

A multilingual 4-class feedback classification model, fine-tuned from mmBERT-base on an AMD Instinct MI300X GPU.

Model Description

This model classifies user feedback into 4 categories:

| Label | ID | Description | F1 Score |
|---|---|---|---|
| SAT | 0 | User is satisfied | 100.0% |
| NEED_CLARIFICATION | 1 | User needs more information | 99.7% |
| WRONG_ANSWER | 2 | System gave incorrect response | 96.2% |
| WANT_DIFFERENT | 3 | User wants something different | 95.9% |
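When working with raw logits instead of the `pipeline` API, the id-to-label mapping from the table above can be kept as a plain dictionary. This is a convenience sketch; the authoritative mapping ships in the model's `config.id2label`.

```python
# Label mapping from the table above (ids 0-3).
# Prefer model.config.id2label when the model is loaded; this is a
# standalone copy for quick post-processing of raw predictions.
ID2LABEL = {
    0: "SAT",
    1: "NEED_CLARIFICATION",
    2: "WRONG_ANSWER",
    3: "WANT_DIFFERENT",
}

# Inverse mapping, e.g. for building evaluation targets.
LABEL2ID = {label: idx for idx, label in ID2LABEL.items()}
```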

Performance

| Metric | Value |
|---|---|
| Accuracy | 98.63% |
| F1 Macro | 97.94% |
| F1 Weighted | 98.62% |

Training Data

  • Dataset: llm-semantic-router/feedback-detector-dataset
  • Size: 51,694 examples (46,524 train / 5,170 validation)
  • Languages: English, Japanese, Turkish
  • Labeling: GPT-OSS-120B via vLLM on AMD MI300X
  • Sources: MultiWOZ, SGD, INSCIT, MIMICS, Hazumi, Consumer Complaints

Training Configuration

| Parameter | Value |
|---|---|
| Base Model | jhu-clsp/mmBERT-base |
| Epochs | 3 |
| Batch Size | 64 |
| Learning Rate | 2e-5 |
| Max Length | 512 |
| Optimizer | AdamW |
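The hyperparameters above can be assembled into a Hugging Face `TrainingArguments` object roughly as follows. This is a sketch, not the original training script; `output_dir` is an assumption, and the max length of 512 is applied at tokenization time rather than here.

```python
from transformers import TrainingArguments

# Sketch of the fine-tuning configuration from the table above.
# output_dir is illustrative; max_length=512 is passed to the tokenizer,
# not to TrainingArguments.
training_args = TrainingArguments(
    output_dir="mmbert-feedback-detector",  # assumed path
    num_train_epochs=3,
    per_device_train_batch_size=64,
    learning_rate=2e-5,
    optim="adamw_torch",  # AdamW optimizer
)
```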

Hardware

| Component | Specification |
|---|---|
| GPU | AMD Instinct MI300X |
| VRAM | 192 GB HBM3 |
| Framework | PyTorch with ROCm |
| Training Time | ~2 minutes |

Usage

Quick Start

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="llm-semantic-router/mmbert-feedback-detector",
)

result = classifier("Thank you, that was exactly what I needed!")
print(result)  # [{'label': 'SAT', 'score': 0.99}]
```

Full Example

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model = AutoModelForSequenceClassification.from_pretrained("llm-semantic-router/mmbert-feedback-detector")
tokenizer = AutoTokenizer.from_pretrained("llm-semantic-router/mmbert-feedback-detector")
model.eval()  # disable dropout for deterministic inference

labels = ["SAT", "NEED_CLARIFICATION", "WRONG_ANSWER", "WANT_DIFFERENT"]

def classify(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=-1)
    pred = probs.argmax(-1).item()
    return labels[pred], probs[0][pred].item()

# Test
label, confidence = classify("Thank you, that was helpful!")
print(f"Label: {label}, Confidence: {confidence:.2%}")
```

Multilingual Examples

```python
# English - Satisfied
classify("Thanks, that's exactly what I needed!")
# => ('SAT', 0.99)

# English - Need clarification
classify("Can you explain that in more detail?")
# => ('NEED_CLARIFICATION', 0.97)

# English - Wrong answer
classify("That's incorrect, the information you gave me was wrong.")
# => ('WRONG_ANSWER', 0.95)

# English - Want different
classify("Can you show me other options instead?")
# => ('WANT_DIFFERENT', 0.94)

# Japanese - Need clarification ("Please explain in a bit more detail")
classify("もう少し詳しく教えてください")
# => ('NEED_CLARIFICATION', 0.96)

# Turkish - Wrong answer ("This is wrong information, please correct it")
classify("Bu yanlış bilgi, düzeltin lütfen")
# => ('WRONG_ANSWER', 0.93)

# German, zero-shot ("Can you show me another option?")
classify("Können Sie mir eine andere Option zeigen?")
# => ('WANT_DIFFERENT', 0.89)

# Spanish, zero-shot ("Thanks, that's exactly what I needed!")
classify("Gracias, eso es exactamente lo que necesitaba!")
# => ('SAT', 0.95)
```

Use Cases

  • Chatbot feedback analysis: Detect user satisfaction in real-time
  • Customer service: Route dissatisfied users to human agents
  • Dialogue systems: Adapt responses based on user feedback
  • Quality monitoring: Track satisfaction metrics across conversations
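For example, the customer-service routing case above can be sketched as a small post-processing step on the classifier output. The labels match this model; the confidence threshold and routing targets are illustrative assumptions.

```python
# Decide where to send a conversation based on the predicted feedback
# label and its confidence. The 0.7 threshold and the routing target
# names are illustrative choices, not part of the model.
NEGATIVE_LABELS = {"WRONG_ANSWER", "WANT_DIFFERENT"}

def route_feedback(label: str, score: float, threshold: float = 0.7) -> str:
    if label in NEGATIVE_LABELS and score >= threshold:
        return "human_agent"  # confident dissatisfaction: escalate
    if label == "NEED_CLARIFICATION":
        return "clarification_flow"  # ask a follow-up question
    return "continue_bot"  # satisfied or low-confidence prediction

print(route_feedback("WRONG_ANSWER", 0.95))  # human_agent
```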

Limitations

  • Best performance on conversational/dialogue text
  • May have reduced accuracy on very short inputs (<5 words)
  • Cross-lingual transfer works best for Romance and Germanic languages

Citation

```bibtex
@misc{mmbert_feedback_detector,
  title={mmBERT Feedback Detector},
  author={LLM Semantic Router Team},
  year={2025},
  url={https://huggingface.co/llm-semantic-router/mmbert-feedback-detector}
}
```

License

Apache 2.0
