Chat Message Tagger (DeBERTa v3)

A multi-label text classifier that assigns 15 semantic labels to group chat messages. Built for enriching chat data with structured content signals — useful for community analytics, newsletter curation, discussion ranking, and moderation tooling.

What It Does

Given a chat message, the model outputs 15 independent sigmoid scores (0.0 to 1.0), each indicating the probability that the message belongs to that category. Multiple labels can be active simultaneously.

Example:

"Has anyone tried fine-tuning DeBERTa for multi-label? I got NaN gradients in fp16."

| Label | Score | Active |
|-------|-------|--------|
| professional | 0.94 | Yes |
| question | 0.91 | Yes |
| experience_sharing | 0.72 | Yes |
| substantive | 0.88 | Yes |
| how_to | 0.15 | No |
| ... | ... | ... |
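The decision rule behind the "Active" column is simple: a label fires when its sigmoid score exceeds that label's tuned threshold. A minimal sketch (the thresholds here are hypothetical; the real values ship in thresholds.json):

```python
# Hypothetical scores and per-label thresholds; the released
# thresholds.json holds the actual tuned values.
scores = {"professional": 0.94, "question": 0.91, "how_to": 0.15}
thresholds = {"professional": 0.50, "question": 0.45, "how_to": 0.40}

# A label is "Active" when its score clears its own threshold.
active = [name for name, s in scores.items() if s > thresholds[name]]
print(active)  # -> ['professional', 'question']
```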

Labels

| # | Label | Definition | Test F1 | AUC-ROC |
|---|-------|------------|---------|---------|
| 1 | professional | Domain-relevant professional substance | 0.839 | 0.882 |
| 2 | question | Asks for information, help, or advice | 0.889 | 0.977 |
| 3 | experience_sharing | First-hand account of trying or building something | 0.631 | 0.844 |
| 4 | resource | Shares a link, tool, paper, or tutorial | 0.746 | 0.928 |
| 5 | opinion | Subjective take, prediction, or stance | 0.626 | 0.853 |
| 6 | how_to | Concrete tip, solution, or workaround | 0.530 | 0.844 |
| 7 | humor | Joke, meme, sarcasm, playful remark | 0.347 | 0.792 |
| 8 | announcement | Community event, meetup, release, group news | 0.723 | 0.969 |
| 9 | off_group_topic | Content unrelated to the community's purpose | 0.258 | 0.753 |
| 10 | reaction | Acknowledgment, agreement, thanks, emoji-only | 0.725 | 0.950 |
| 11 | substantive | High information density, summary-worthy | 0.787 | 0.963 |
| 12 | discussion_init | Initiates a new topic or conversation | 0.706 | 0.870 |
| 13 | emotional | Highly emotional tone (excitement, frustration) | 0.437 | 0.852 |
| 14 | disagreement | Adversarial or confrontational disagreement | 0.123 | 0.889 |
| 15 | positive_reinforcement | Encouragement or gratitude | 0.603 | 0.873 |

Quick Start

```python
import json

import torch
from huggingface_hub import hf_hub_download
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "eladlaor/chat-message-tagger-deberta-v3"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

# Load per-label thresholds (optimized on the validation set)
thresholds_path = hf_hub_download(model_name, "thresholds.json")
with open(thresholds_path) as f:
    thresholds = json.load(f)

# Label names in model output order
label_names = [
    "professional", "question", "experience_sharing", "resource", "opinion",
    "how_to", "humor", "announcement", "off_group_topic", "reaction",
    "substantive", "discussion_init", "emotional", "disagreement",
    "positive_reinforcement",
]

# Predict
text = "Has anyone tried fine-tuning DeBERTa? I got NaN gradients in fp16."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=256)

with torch.no_grad():
    logits = model(**inputs).logits[0]
    scores = torch.sigmoid(logits).numpy()

# Apply per-label thresholds
for name, score in zip(label_names, scores):
    threshold = thresholds.get(name, 0.5)
    active = "<<" if score > threshold else ""
    print(f"  {name:30s}  {score:.3f}  {active}")
```

Training

  • Base model: microsoft/deberta-v3-base (184M params)
  • Method: two-phase fine-tuning: (1) head-only training with a frozen backbone (LR=1e-3), then (2) LoRA adaptation (rank=16, LR=2e-5)
  • Loss: BCEWithLogitsLoss with per-label pos_weight for rare labels
  • Precision: fp32 (DeBERTa v3 produces NaN in fp16/bf16)
  • Data: 3,842 training samples from professional tech community group chats (Hebrew translated to English)
  • Dataset: eladlaor/chat-message-multilabel
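The per-label pos_weight mentioned above is conventionally the ratio of negatives to positives for each label, so that rare labels are upweighted inside the BCE loss. A minimal sketch on a toy label matrix (the counts are illustrative, not the real dataset's):

```python
import numpy as np

# Toy multi-label matrix: 6 messages x 3 labels (1 = label active).
Y = np.array([
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
    [1, 0, 0],
])

n_pos = Y.sum(axis=0)       # positives per label: [4, 2, 1]
n_neg = len(Y) - n_pos      # negatives per label: [2, 4, 5]
pos_weight = n_neg / n_pos  # here: [0.5, 2.0, 5.0] -- rare labels weigh more
```

The resulting vector would then be passed as the `pos_weight` argument to `torch.nn.BCEWithLogitsLoss`, one weight per output logit.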

Evaluation

| Metric | Value |
|--------|-------|
| Mean F1 (macro) | 0.598 |
| Hamming Loss | 0.123 |
| Subset Accuracy | 0.193 |
| Mean AUC-ROC | 0.883 |
| Mean ECE | 0.136 |
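Hamming loss and subset accuracy measure different things: the former counts individual wrong label decisions, while the latter only credits a message when every label matches. A toy sketch of both (data is illustrative):

```python
import numpy as np

# Toy ground truth vs. predictions: 4 messages x 3 labels.
y_true = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 1, 0],
                   [0, 0, 0]])
y_pred = np.array([[1, 0, 1],
                   [0, 0, 0],
                   [1, 1, 0],
                   [0, 0, 1]])

# Hamming loss: fraction of individual label cells that are wrong.
hamming = (y_true != y_pred).mean()                 # 2 / 12 wrong cells

# Subset accuracy: fraction of rows where ALL labels match exactly.
subset_acc = (y_true == y_pred).all(axis=1).mean()  # 2 / 4 rows match
```

This asymmetry is why subset accuracy (0.193) looks much lower than the other metrics: a single wrong label out of 15 zeroes out a message's credit.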

Included Files

| File | Description |
|------|-------------|
| model.safetensors | Merged model weights (LoRA baked in) |
| config.json | Model config with problem_type=multi_label_classification |
| tokenizer.json | Tokenizer with 7 special tokens ([URL], [MENTION], etc.) |
| thresholds.json | Per-label classification thresholds (optimized on validation set) |
| label_taxonomy.json | Label definitions and examples for downstream consumers |
| evaluation.json | Full per-label metrics on test set |

Intended Use

  • Community analytics: Tag messages in Slack, Discord, Telegram, WhatsApp for content analysis
  • Newsletter curation: Identify high-value discussions for automated summaries
  • Moderation support: Flag off-topic content or heated disagreements
  • Discussion ranking: Enrich messages with semantic signals before LLM-based ranking

Limitations

  • English only. Trained on Hebrew messages translated to English. Non-English input will produce unreliable scores.
  • Tech community bias. Trained on professional tech community chats. Labels like "professional" and "how_to" are calibrated for tech discussions. May need fine-tuning for non-tech domains.
  • Weak labels: disagreement (F1=0.12), off_group_topic (F1=0.26), and humor (F1=0.35) have low F1 due to limited training data. Their AUC-ROC is strong (>0.75), so they rank correctly but thresholding is unreliable.
  • Not a content moderation tool. This model classifies content type, not toxicity or safety. Do not use it for content moderation decisions.
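For the weak-F1 labels, the strong AUC-ROC means the raw scores still rank messages well even where the threshold is unreliable, so ranking-based use (take the top-k scored messages per label) is safer than thresholding. A sketch with hypothetical scores:

```python
# Hypothetical "humor" scores for four messages.
scores = {"msg1": 0.41, "msg2": 0.08, "msg3": 0.55, "msg4": 0.30}

# Rank by score and keep the top-k instead of applying a threshold.
k = 2
top_k = sorted(scores, key=scores.get, reverse=True)[:k]
print(top_k)  # -> ['msg3', 'msg1']
```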

License

Apache 2.0

Citation

@misc{chat-message-tagger-2026,
  author = {Elad Laor},
  title = {Chat Message Tagger: Multi-Label Classification for Group Chat Messages},
  year = {2026},
  publisher = {HuggingFace},
  url = {https://huggingface.co/eladlaor/chat-message-tagger-deberta-v3}
}