TextThreat DistilBERT Dreaddit Stress Classifier

This model is part of TextThreat - AI-Powered Detection of Digital Well-Being Risks with Cybersecurity Analytics, an MSc thesis proof-of-concept by Abdul Muksith Rizvi at the University of Doha for Science and Technology.

TextThreat combines AI harm detection with cybersecurity analytics by converting model outputs into SIEM-ready threat events for Splunk/OpenSearch dashboards and SOAR-lite alerting.

Model Details

  • Model type: DistilBERT sequence classifier
  • Base model: distilbert-base-uncased
  • Task: Dreaddit binary stress classification
  • Labels: non_stress, stress
  • Problem type: single-label classification
  • Training method: LoRA/PEFT fine-tuning, exported as a merged full Hugging Face model
  • Repository: https://github.com/abdulmuksith3/textthreat-poc
  • Thesis role: stress signal model used alongside Jigsaw toxicity labels to support digital well-being risk scoring and synthetic co-occurrence analytics

Intended Use

This model is intended for the TextThreat proof-of-concept pipeline:

social media comment
-> stress probability
-> TextThreat digital_wellbeing event fields
-> Splunk/OpenSearch dashboard analytics
-> optional SOAR-lite alerting

It can be used in research demos where stress-related language is one signal in a broader harm and digital well-being analytics workflow.
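As an illustration, the pipeline steps above could be sketched as follows. This is a minimal sketch, not the actual TextThreat code: the field names (`event_type`, `stress_score`, `alert`) and the default threshold are hypothetical placeholders, not the real digital_wellbeing schema.

```python
import json
from datetime import datetime, timezone


def to_wellbeing_event(comment_id: str, stress_prob: float,
                       threshold: float = 0.7) -> dict:
    """Map a stress probability onto a SIEM-ready event record.

    Field names here are illustrative placeholders, not the real
    TextThreat digital_wellbeing schema.
    """
    return {
        "event_type": "digital_wellbeing",
        "signal": "stress",
        "comment_id": comment_id,
        "stress_score": round(stress_prob, 4),
        # An above-threshold score could feed the optional SOAR-lite alerting
        "alert": stress_prob >= threshold,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


# Example: a stress probability of 0.83 crosses the illustrative threshold
event = to_wellbeing_event("comment-123", 0.83)
print(json.dumps(event, indent=2))
```

A record shaped like this could then be indexed into Splunk or OpenSearch for dashboarding.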

Out-of-Scope Use

This model is not a clinical mental-health diagnostic model. It must not be used for diagnosis, emergency triage, autonomous intervention, or high-stakes decisions. Human review and appropriate support pathways are required for real-world safety workflows.

Evaluation

The current uploaded artifact corresponds to the thesis proof-of-concept training run. Metrics are stored in the repository under experiments/results/dreaddit_metrics.json.

Metric     Value
F1         0.7479
ROC-AUC    0.8237
PR-AUC     0.8259
Eval loss  0.6318

Limitations

  • Dreaddit labels are useful for stress-language research, but do not establish clinical state.
  • Predictions on short or ambiguous phrases may be poorly calibrated.
  • The TextThreat demo uses a configurable alert threshold and a transparent safety lexical overlay for explicit high-risk terms.
  • Model outputs should be interpreted as risk signals for dashboarding and analyst review, not final judgments.
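The configurable threshold and lexical overlay mentioned above can be sketched as follows. This is an assumed illustration, not the TextThreat implementation: the term list, function name, and returned fields are all hypothetical.

```python
# Illustrative placeholder list, not the real TextThreat safety lexicon
HIGH_RISK_TERMS = {"hurt myself", "end it all"}


def risk_decision(text: str, stress_prob: float,
                  threshold: float = 0.7) -> dict:
    """Combine the model score with a transparent lexical overlay.

    The overlay flags explicit high-risk phrases regardless of the model
    score, so an analyst can see exactly why an alert fired.
    """
    lowered = text.lower()
    lexical_hits = sorted(t for t in HIGH_RISK_TERMS if t in lowered)
    return {
        "model_alert": stress_prob >= threshold,
        "lexical_hits": lexical_hits,
        # Either signal alone is enough to surface the item for review
        "alert": stress_prob >= threshold or bool(lexical_hits),
    }
```

Keeping the lexical hits in the output preserves explainability: the dashboard can show whether an alert came from the model score, the lexicon, or both.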

Example

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "abdulmuksith/textthreat-distilbert-dreaddit"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()  # inference mode

text = "I feel stressed and overwhelmed"
# The tokenizer returns input_ids and attention_mask, both of which the
# DistilBERT classifier accepts directly.
inputs = tokenizer(text, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits[0]

# Convert raw logits to per-label probabilities
scores = torch.softmax(logits, dim=0)
print({model.config.id2label[i]: float(scores[i]) for i in range(len(scores))})

Citation

If referencing this model, cite the thesis project:

Rizvi, A. M. (2026). TextThreat: AI-Powered Detection of Digital Well-Being Risks with Cybersecurity Analytics. MSc thesis, University of Doha for Science and Technology.