GoEmotions Chatbot Emotion Classifier

This repository contains a production export of a DeBERTa-v3-large emotion classifier trained with two classification heads:

Head A: GoEmotions, 28 labels, used for production inference.
Head B: MELD, 7 labels, used as an auxiliary dialogue-context training task.

Only Head A is included in this export.

Intended Use

The model predicts user emotional signals for chatbot applications. Production inference returns the top 3 emotions with probabilities and prompt-ready wording for an LLM.

In production, three low-support GoEmotions classes that the model predicts unreliably are merged into related classes:

grief -> sadness
pride -> admiration
relief -> joy

This gives a 25-label production label space while preserving the original raw 28-way logits for inspection.
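The merge can be sketched as a post-processing step over per-label probabilities. This is a minimal illustration, not the code in inference.py: the merge map comes from the list above, but summing the merged probabilities is an assumption (taking the maximum would be another reasonable choice), and the function names and sample values are hypothetical.

```python
# Merge map taken from the list above.
MERGE_MAP = {"grief": "sadness", "pride": "admiration", "relief": "joy"}


def merge_probabilities(raw_probs):
    """Fold each merged class into its target label.

    Assumption: probabilities of merged classes are summed into the target.
    """
    merged = {}
    for label, prob in raw_probs.items():
        target = MERGE_MAP.get(label, label)
        merged[target] = merged.get(target, 0.0) + prob
    return merged


def top_k(probs, k=3):
    """Return the k highest-probability (label, probability) pairs."""
    return sorted(probs.items(), key=lambda item: item[1], reverse=True)[:k]


# Illustrative values only, not real model output.
raw = {"sadness": 0.40, "grief": 0.15, "fear": 0.30, "joy": 0.10, "relief": 0.05}
print(top_k(merge_probabilities(raw)))
```

Because the raw 28-way scores are left untouched, the unmerged grief, pride, and relief values remain available for inspection.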

Metrics

Held-out GoEmotions test set:

Weighted F1: 0.5285
Accuracy: 0.5338
Macro F1: 0.4143
ECE@15: 0.0595

Chatbot-domain validation set, 249 labelled examples, production merged 25-label space:

Weighted F1: 0.4568
Accuracy: 0.4739
Macro F1: 0.4586
ECE@15: 0.1757

The chatbot-domain weighted F1 of 0.4568 passes the chosen deployment gate of weighted F1 >= 0.43.
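For readers unfamiliar with the calibration metric, the sketch below shows one common way to compute expected calibration error. It assumes that ECE@15 means 15 equal-width confidence bins over the top-1 prediction; the evaluation code behind the numbers above is not part of this export, so the function name and binning details are illustrative.

```python
def expected_calibration_error(confidences, correct, n_bins=15):
    """Expected calibration error with equal-width confidence bins.

    confidences: top-1 probability for each example, in [0, 1].
    correct: 1 (or True) if the top-1 prediction was right, else 0.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    total = len(confidences)
    if total == 0:
        raise ValueError("at least one example is required")

    conf_sum = [0.0] * n_bins
    hit_sum = [0.0] * n_bins
    count = [0] * n_bins
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        conf_sum[idx] += conf
        hit_sum[idx] += float(hit)
        count[idx] += 1

    ece = 0.0
    for b in range(n_bins):
        if count[b] == 0:
            continue
        avg_conf = conf_sum[b] / count[b]
        accuracy = hit_sum[b] / count[b]
        # Weight each bin's confidence/accuracy gap by its share of examples.
        ece += (count[b] / total) * abs(accuracy - avg_conf)
    return ece
```

A lower value means predicted confidence tracks observed accuracy more closely.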

Files

encoder/: DeBERTa-v3-large encoder weights and config.
tokenizer/: tokenizer files.
head_a.pt: production GoEmotions classification head.
metadata.json: training metadata, label list, and validation metadata.
inference.py: standalone local inference helper.
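Before running inference, it can help to confirm that a download is complete. The following is a small sketch that checks a directory against the entries listed above; the helper name missing_entries is hypothetical and is not provided by this repository.

```python
from pathlib import Path

# Entries from the Files list above; names ending in "/" are directories.
EXPECTED_ENTRIES = [
    "encoder/",
    "tokenizer/",
    "head_a.pt",
    "metadata.json",
    "inference.py",
]


def missing_entries(model_dir):
    """Return the expected entries that are absent from model_dir."""
    root = Path(model_dir)
    missing = []
    for entry in EXPECTED_ENTRIES:
        path = root / entry.rstrip("/")
        present = path.is_dir() if entry.endswith("/") else path.is_file()
        if not present:
            missing.append(entry)
    return missing
```

Calling missing_entries(".") from the repository root should return an empty list when the export is intact.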

Local Inference

After downloading the repo:

python inference.py --model-dir . --text "I tried everything and I'm worried I broke the account."

For prompt-ready LLM formatting:

python inference.py --model-dir . --text "I tried everything and I'm worried I broke the account." --output-format llm

Expected output shape:

User emotional signals (top 3):
- Primary: fear (confidence: 0.XX)
- Secondary: confusion (confidence: 0.XX)
- Tertiary: sadness (confidence: 0.XX)
Interpretation: User likely feels fear.
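If the LLM-formatted text needs to be consumed programmatically, it can be parsed back into structured data. The sketch below infers its pattern from the output shape shown above, so the regular expression is an assumption and may need adjusting if the actual output differs; parse_llm_output is a hypothetical helper, and the sample confidences are made up.

```python
import re

# Pattern inferred from the output shape above (an assumption, not a spec).
SIGNAL_LINE = re.compile(
    r"^- (Primary|Secondary|Tertiary): ([\w ]+?) \(confidence: (\d+(?:\.\d+)?)\)$"
)


def parse_llm_output(text):
    """Extract ranked emotion signals from the LLM-formatted output."""
    signals = []
    for line in text.splitlines():
        match = SIGNAL_LINE.match(line.strip())
        if match:
            rank, label, confidence = match.groups()
            signals.append(
                {"rank": rank.lower(), "label": label, "confidence": float(confidence)}
            )
    return signals


# Illustrative text with made-up confidences.
sample = """User emotional signals (top 3):
- Primary: fear (confidence: 0.62)
- Secondary: confusion (confidence: 0.21)
- Tertiary: sadness (confidence: 0.09)
Interpretation: User likely feels fear."""
print(parse_llm_output(sample))
```

Lines that do not match the pattern, such as the header and the interpretation line, are skipped.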

Limitations

This is a custom PyTorch export, not a drop-in AutoModelForSequenceClassification checkpoint.
The model is less well calibrated on chatbot-domain data (ECE@15 0.1757) than on the GoEmotions test split (ECE@15 0.0595).
Classes that perform weakly on chatbot-domain data include nervousness, neutral, annoyance, approval, and caring.
Output probabilities should be treated as coarse confidence bands, not as exact likelihoods.

Recommended confidence handling:

>= 0.50: high confidence.
0.35 to < 0.50: medium confidence.
< 0.35: low confidence; treat emotional state as unclear.
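The bands above map directly onto a small helper. This is a minimal sketch: the thresholds come from the list above, while the function name confidence_band and the returned strings are illustrative choices.

```python
def confidence_band(probability):
    """Map a predicted probability to a coarse confidence band."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability >= 0.50:
        return "high"
    if probability >= 0.35:
        return "medium"
    # Below 0.35 the emotional state should be treated as unclear.
    return "low"
```

A downstream prompt builder could, for example, omit the emotion hint entirely when the primary emotion falls in the low band.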

Training Data

The model was trained on GoEmotions with MELD as an auxiliary dialogue-context task. It was then validated on a small in-domain set of 249 labelled chatbot examples.
