MentalBERT V5 Hierarchical V3: 5-Stage Cascade

A 5-stage hierarchical mental-health text classifier covering 8 classes: Normal, Depression, Suicidal, Anxiety, Stress, Bipolar, Personality Disorder, Directed Aggression.

This is the Deep Dive tier of the Vibecheck project (Helwan University). The Quick Vibe counterpart is itsLu/mentalbert-v5-flat-8class.

Architecture

text → Stage 0  (Cardiff RoBERTa, DA binary)
           ↓
       Stage 1A (MentalBERT, Suicidal binary)
           ↓
       Stage 1B (MentalBERT, Normal vs Distress)
           ↓
       Stage 2  (MentalBERT, 5-class Anxiety/Bipolar/Dep/PD/Stress)
           ↓
       Stage 3  (Longformer, Depression vs Suicidal re-scorer)
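The cascade can be read as a chain of early exits. The sketch below is illustrative only: the stage order and roles come from the diagram, while the exit rules, the probability key names, the default thresholds of 0.5, and the choice to send only Depression predictions to Stage 3 are assumptions, not the behaviour of the shipped handler.py.

```python
def route(probs: dict, t0: float = 0.5, t1a: float = 0.5,
          t1b: float = 0.5, t3: float = 0.5) -> dict:
    """Walk the cascade top to bottom and return the first exit.

    `probs` holds hypothetical, precomputed stage outputs:
      stage0_da, stage1a_suicidal, stage1b_normal, stage3_suicidal -> float
      stage2 -> dict mapping each of the 5 distress labels to a probability
    """
    # Stage 0: Directed Aggression (DA) gate.
    if probs["stage0_da"] >= t0:
        return {"label": "Directed Aggression", "exit_stage": "stage0"}
    # Stage 1A: Suicidal gate.
    if probs["stage1a_suicidal"] >= t1a:
        return {"label": "Suicidal", "exit_stage": "stage1a"}
    # Stage 1B: Normal vs Distress.
    if probs["stage1b_normal"] >= t1b:
        return {"label": "Normal", "exit_stage": "stage1b"}
    # Stage 2: pick the most probable of the 5 distress classes.
    stage2 = probs["stage2"]
    label = max(stage2, key=stage2.get)
    if label != "Depression":
        return {"label": label, "exit_stage": "stage2"}
    # Stage 3 (assumed): re-score Depression predictions against Suicidal.
    if probs["stage3_suicidal"] >= t3:
        return {"label": "Suicidal", "exit_stage": "stage3"}
    return {"label": "Depression", "exit_stage": "stage3"}


# Made-up probabilities, for illustration only.
example = {
    "stage0_da": 0.02,
    "stage1a_suicidal": 0.10,
    "stage1b_normal": 0.20,
    "stage2": {"Anxiety": 0.1, "Bipolar": 0.1, "Depression": 0.6,
               "Personality Disorder": 0.1, "Stress": 0.1},
    "stage3_suicidal": 0.70,
}
print(route(example))
```

Under these assumptions, lowering t1a or t3 trades more Suicidal predictions for fewer misses.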

Each stage is fine-tuned on its own task-specific slice of the full V5 dataset. Routing thresholds are calibrated on the validation set via a joint t1A × t3 grid sweep, which produces two operating points:

  • BALANCED: maximises F1_macro − 0.30·Sui_miss_rate − 0.30·Sui_FP_rate.
  • SAFETY: constrains the number of Suicidal→Depression errors (Sui→Dep) to ≤ 150 on the validation set, then minimises Depression→Suicidal errors (Dep→Sui).
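The two selection rules can be sketched as follows, assuming each grid point has already been evaluated on the validation set. The dictionary keys and every number in `grid` are made up for illustration; only the 0.30 penalties and the 150 cap come from the text above.

```python
from typing import List, Optional


def balanced_score(f1_macro: float, sui_miss_rate: float,
                   sui_fp_rate: float, penalty: float = 0.30) -> float:
    """F1_macro minus 0.30 * Suicidal miss rate minus 0.30 * Suicidal FP rate."""
    return f1_macro - penalty * sui_miss_rate - penalty * sui_fp_rate


def pick_balanced(grid: List[dict]) -> dict:
    """Return the grid point with the highest BALANCED score."""
    return max(grid, key=lambda g: balanced_score(
        g["f1_macro"], g["sui_miss_rate"], g["sui_fp_rate"]))


def pick_safety(grid: List[dict], max_sui_to_dep: int = 150) -> Optional[dict]:
    """Keep points with Sui->Dep <= cap, then minimise Dep->Sui."""
    feasible = [g for g in grid if g["sui_to_dep"] <= max_sui_to_dep]
    if not feasible:
        return None  # no threshold pair satisfies the constraint
    return min(feasible, key=lambda g: g["dep_to_sui"])


# Toy grid with invented metrics (NOT results from the model).
grid = [
    {"t1a": 0.3, "t3": 0.5, "f1_macro": 0.80, "sui_miss_rate": 0.10,
     "sui_fp_rate": 0.20, "sui_to_dep": 120, "dep_to_sui": 300},
    {"t1a": 0.5, "t3": 0.5, "f1_macro": 0.82, "sui_miss_rate": 0.15,
     "sui_fp_rate": 0.10, "sui_to_dep": 160, "dep_to_sui": 200},
    {"t1a": 0.4, "t3": 0.6, "f1_macro": 0.81, "sui_miss_rate": 0.12,
     "sui_fp_rate": 0.15, "sui_to_dep": 140, "dep_to_sui": 250},
]
print(pick_balanced(grid)["t1a"], pick_safety(grid)["t1a"])
```

On this toy grid the two rules pick different threshold pairs, which is why the model exposes two modes.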

Usage (HF Inference Endpoint)

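import os
import requests
# Assumes the endpoint URL and token are exported as environment variables.
ENDPOINT_URL = os.environ["ENDPOINT_URL"]
HF_TOKEN = os.environ["HF_TOKEN"]
r = requests.post(ENDPOINT_URL,
    headers={"Authorization": f"Bearer {HF_TOKEN}"},
    json={"inputs": "I don't see the point anymore.", "mode": "safety"},
    timeout=30)
r.raise_for_status()  # fail loudly on HTTP errors
print(r.json())
# -> {"label": "Suicidal", "exit_stage": "stage1a", "mode": "safety",
#     "stage_probs": {...}}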

mode is "balanced" (default) or "safety".
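Since only two modes are documented, a client may want to reject anything else before sending a request. The helper below is a hypothetical convenience, not part of this repository; how handler.py treats an unknown mode is not stated here.

```python
VALID_MODES = ("balanced", "safety")


def build_payload(text: str, mode: str = "balanced") -> dict:
    """Build the JSON body for the endpoint, rejecting unknown modes."""
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {VALID_MODES}, got {mode!r}")
    return {"inputs": text, "mode": mode}


print(build_payload("I don't see the point anymore.", mode="safety"))
```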

Files

  • stage0/: Cardiff RoBERTa, DA gate
  • stage1a/: MentalBERT, Suicidal gate
  • stage1b/: MentalBERT, Normal/Distress
  • stage2/: MentalBERT, 5-class distress
  • stage3/: Longformer, Dep/Sui re-scorer
  • config.json: thresholds, class order, metrics
  • handler.py: HF Inference Endpoints handler with mode switching

Training data

mohamedasem318/mental-health-dataset-extended-v5: 88,293 rows from 6 sources (cssrs, olid, kaggle_bpd, kaggle, huggingface, swmh). Stratified 70/10/20 split, random_state=42. Per-source reliability constants are recorded in config.json but were not applied to the loss. Training used plain class-weighted cross-entropy (CE) only, because a prior attempt with a per-sample-weighted wrapper destabilised Stage 0 training under fp16/AMP.
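For readers unfamiliar with stratified splitting, here is a minimal standard-library sketch of a 70/10/20 split that preserves per-class proportions. It is not the authors' code: the train/validation/test ordering of the three parts is an assumption, and Python's `random.Random(42)` will not reproduce the row assignment of whatever tool produced the original split.

```python
import random
from collections import defaultdict


def stratified_split(labels, percents=(70, 10, 20), seed=42):
    """Return (train, val, test) index lists, split class by class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)
        # Integer arithmetic avoids floating-point rounding surprises.
        n_train = len(idxs) * percents[0] // 100
        n_val = len(idxs) * percents[1] // 100
        train += idxs[:n_train]
        val += idxs[n_train:n_train + n_val]
        test += idxs[n_train + n_val:]  # remainder goes to the last part
    return train, val, test


# Toy labels, for illustration only.
labels = ["Normal"] * 100 + ["Suicidal"] * 50
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))
```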

Citation

If you use this model, please cite the Vibecheck project (Helwan University final-year project, 2026).
