TextThreat DistilBERT Dreaddit Stress Classifier

This model is part of TextThreat - AI-Powered Detection of Digital Well-Being Risks with Cybersecurity Analytics, an MSc thesis proof-of-concept by Abdul Muksith Rizvi at the University of Doha for Science and Technology.

TextThreat combines AI harm detection with cybersecurity analytics by converting model outputs into SIEM-ready threat events for Splunk/OpenSearch dashboards and SOAR-lite alerting.

Model Details

  • Model type: DistilBERT sequence classifier
  • Base model: distilbert-base-uncased
  • Task: Dreaddit binary stress classification
  • Labels: non_stress, stress
  • Problem type: single-label classification
  • Training method: LoRA/PEFT fine-tuning, exported as a merged full Hugging Face model
  • Repository: https://github.com/abdulmuksith3/textthreat-poc
  • Thesis role: stress signal model used alongside Jigsaw toxicity labels to support digital well-being risk scoring and synthetic co-occurrence analytics

Intended Use

This model is intended for the TextThreat proof-of-concept pipeline:

social media comment
-> stress probability
-> TextThreat digital_wellbeing event fields
-> Splunk/OpenSearch dashboard analytics
-> optional SOAR-lite alerting

It can be used in research demos where stress-related language is one signal in a broader harm and digital well-being analytics workflow.
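As an illustration, the pipeline steps above could be sketched as follows. This is a minimal sketch, not the actual TextThreat code: the field names (`event_type`, `stress_score`, `alert`) and the default threshold are hypothetical placeholders, not the real digital_wellbeing schema.

```python
import json
from datetime import datetime, timezone


def to_wellbeing_event(comment_id: str, stress_prob: float,
                       threshold: float = 0.7) -> dict:
    """Map a stress probability onto a SIEM-ready event record.

    Field names here are illustrative placeholders, not the real
    TextThreat digital_wellbeing schema.
    """
    return {
        "event_type": "digital_wellbeing",
        "signal": "stress",
        "comment_id": comment_id,
        "stress_score": round(stress_prob, 4),
        # An above-threshold score could feed the optional SOAR-lite alerting
        "alert": stress_prob >= threshold,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


# Example: a stress probability of 0.83 crosses the illustrative threshold
event = to_wellbeing_event("comment-123", 0.83)
print(json.dumps(event, indent=2))
```

A record shaped like this could then be indexed into Splunk or OpenSearch for dashboarding.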

Out-of-Scope Use

This model is not a clinical mental-health diagnostic model. It must not be used for diagnosis, emergency triage, autonomous intervention, or high-stakes decisions. Human review and appropriate support pathways are required for real-world safety workflows.

Evaluation

The current uploaded artifact corresponds to the thesis proof-of-concept training run. Metrics are stored in the repository under experiments/results/dreaddit_metrics.json.

Metric     Value
F1         0.7479
ROC-AUC    0.8237
PR-AUC     0.8259
Eval loss  0.6318

Limitations

  • Dreaddit labels are useful for stress-language research, but do not establish clinical state.
  • Predictions on short or ambiguous phrases may be poorly calibrated.
  • The TextThreat demo uses a configurable alert threshold and a transparent safety lexical overlay for explicit high-risk terms.
  • Model outputs should be interpreted as risk signals for dashboarding and analyst review, not final judgments.
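The configurable threshold and lexical overlay mentioned above can be sketched as follows. This is an assumed illustration, not the TextThreat implementation: the term list, function name, and returned fields are all hypothetical.

```python
# Illustrative placeholder list, not the real TextThreat safety lexicon
HIGH_RISK_TERMS = {"hurt myself", "end it all"}


def risk_decision(text: str, stress_prob: float,
                  threshold: float = 0.7) -> dict:
    """Combine the model score with a transparent lexical overlay.

    The overlay flags explicit high-risk phrases regardless of the model
    score, so an analyst can see exactly why an alert fired.
    """
    lowered = text.lower()
    lexical_hits = sorted(t for t in HIGH_RISK_TERMS if t in lowered)
    return {
        "model_alert": stress_prob >= threshold,
        "lexical_hits": lexical_hits,
        # Either signal alone is enough to surface the item for review
        "alert": stress_prob >= threshold or bool(lexical_hits),
    }
```

Keeping the lexical hits in the output preserves explainability: the dashboard can show whether an alert came from the model score, the lexicon, or both.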

Example

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "abdulmuksith/textthreat-distilbert-dreaddit"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()  # inference mode

text = "I feel stressed and overwhelmed"
# The tokenizer returns input_ids and attention_mask, both of which the
# DistilBERT classifier accepts directly.
inputs = tokenizer(text, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits[0]

# Convert raw logits to per-label probabilities
scores = torch.softmax(logits, dim=0)
print({model.config.id2label[i]: float(scores[i]) for i in range(len(scores))})

Citation

If referencing this model, cite the thesis project:

Rizvi, A. M. (2026). TextThreat: AI-Powered Detection of Digital Well-Being Risks with Cybersecurity Analytics. MSc thesis, University of Doha for Science and Technology.