MentalBERT V5 Hierarchical V3 — 5-Stage Cascade

A 5-stage hierarchical mental-health text classifier covering 8 classes: Normal, Depression, Suicidal, Anxiety, Stress, Bipolar, Personality Disorder, Directed Aggression.

This is the Deep Dive tier of the Vibecheck project (Helwan University). The Quick Vibe counterpart is itsLu/mentalbert-v5-flat-8class.

Architecture

text → Stage 0  (Cardiff RoBERTa, DA binary)
           ↓
       Stage 1A (MentalBERT, Suicidal binary)
           ↓
       Stage 1B (MentalBERT, Normal vs Distress)
           ↓
       Stage 2  (MentalBERT, 5-class Anxiety/Bipolar/Dep/PD/Stress)
           ↓
       Stage 3  (Longformer, Depression vs Suicidal re-scorer)
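
The routing implied by the diagram can be sketched as below. This is a minimal illustration, not the logic shipped in handler.py. Only the stage order, the class names, and the existence of the t1A and t3 thresholds come from this card. The function name `route`, the probability key names, the 0.5 cut-offs for Stage 0 and Stage 1B, and the rule that Stage 3 re-scores only Stage 2 Depression predictions are all assumptions.

```python
from typing import Dict, Tuple

def route(probs: Dict, t1a: float, t3: float,
          t0: float = 0.5, t1b: float = 0.5) -> Tuple[str, str]:
    """Return (label, exit_stage) for one text.

    `probs` is a hypothetical dict of per-stage probabilities that the five
    models would have produced; the key names are illustrative only.
    """
    # Stage 0: Directed Aggression gate (cut-off t0 is an assumption).
    if probs["stage0_da"] >= t0:
        return "Directed Aggression", "stage0"
    # Stage 1A: Suicidal gate, using the calibrated threshold t1A.
    if probs["stage1a_suicidal"] >= t1a:
        return "Suicidal", "stage1a"
    # Stage 1B: Normal vs Distress (cut-off t1b is an assumption).
    if probs["stage1b_distress"] < t1b:
        return "Normal", "stage1b"
    # Stage 2: pick the most probable of the five distress classes.
    stage2 = probs["stage2"]
    label = max(stage2, key=stage2.get)
    if label != "Depression":
        return label, "stage2"
    # Stage 3: assumed to re-score Depression predictions against Suicidal.
    if probs["stage3_suicidal"] >= t3:
        return "Suicidal", "stage3"
    return "Depression", "stage3"

# Made-up probabilities, for illustration only.
example = {
    "stage0_da": 0.02,
    "stage1a_suicidal": 0.35,
    "stage1b_distress": 0.90,
    "stage2": {"Anxiety": 0.10, "Bipolar": 0.05, "Depression": 0.70,
               "Personality Disorder": 0.05, "Stress": 0.10},
    "stage3_suicidal": 0.40,
}
print(route(example, t1a=0.60, t3=0.50))  # ('Depression', 'stage3')
print(route(example, t1a=0.60, t3=0.30))  # ('Suicidal', 'stage3')
```

Lowering t3 in this sketch moves borderline Depression predictions to Suicidal, which is the kind of trade-off the two operating points below choose between.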

Each stage is fine-tuned on its own task-specific carve-out of the full V5 dataset. Routing thresholds are calibrated on the validation set via a joint grid sweep over t1A × t3 (the Stage 1A and Stage 3 decision thresholds), producing two operating points:

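BALANCED: maximises F1_macro − 0.30·Sui_miss_rate − 0.30·Sui_FP_rate, i.e. macro F1 penalised by the Suicidal miss rate and the Suicidal false-positive rate.
SAFETY: constrains Sui→Dep (Suicidal texts predicted as Depression) to ≤ 150 on the validation set, then minimises Dep→Sui (Depression texts predicted as Suicidal).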
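
The two selection rules can be sketched over a list of sweep results. This is a minimal sketch, assuming each grid cell already records the five metrics named above. The `SweepPoint` class, its field names, and the numbers in `grid` are hypothetical and are not taken from config.json.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class SweepPoint:
    """Validation metrics for one (t1A, t3) grid cell (hypothetical schema)."""
    t1a: float
    t3: float
    f1_macro: float
    sui_miss_rate: float
    sui_fp_rate: float
    sui_to_dep: int   # Suicidal texts predicted as Depression (count)
    dep_to_sui: int   # Depression texts predicted as Suicidal (count)

def balanced_score(p: SweepPoint) -> float:
    # Objective stated in the card: F1_macro - 0.30*miss - 0.30*FP.
    return p.f1_macro - 0.30 * p.sui_miss_rate - 0.30 * p.sui_fp_rate

def pick_balanced(points: List[SweepPoint]) -> SweepPoint:
    return max(points, key=balanced_score)

def pick_safety(points: List[SweepPoint], max_sui_to_dep: int = 150) -> SweepPoint:
    # Keep only cells that satisfy the Sui->Dep constraint, then minimise Dep->Sui.
    feasible = [p for p in points if p.sui_to_dep <= max_sui_to_dep]
    if not feasible:
        raise ValueError("no grid point satisfies the Sui->Dep constraint")
    return min(feasible, key=lambda p: p.dep_to_sui)

# Made-up sweep results, for illustration only.
grid = [
    SweepPoint(0.50, 0.50, 0.80, 0.10, 0.05, 200, 90),
    SweepPoint(0.40, 0.50, 0.79, 0.07, 0.08, 140, 130),
    SweepPoint(0.30, 0.40, 0.77, 0.05, 0.12, 100, 180),
]
print(pick_balanced(grid).t1a, pick_safety(grid).t1a)  # 0.5 0.4
```

In this toy grid the BALANCED rule picks the cell with the best penalised F1, while the SAFETY rule rejects it because its Sui→Dep count exceeds 150.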

Usage (HF Inference Endpoint)

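import os
import requests

# Placeholders: the environment variable names are assumptions; set them to your endpoint URL and HF token.
ENDPOINT_URL = os.environ["ENDPOINT_URL"]
HF_TOKEN = os.environ["HF_TOKEN"]
r = requests.post(ENDPOINT_URL,
    headers={"Authorization": f"Bearer {HF_TOKEN}"},
    json={"inputs": "I don't see the point anymore.", "mode": "safety"},
    timeout=30)
r.raise_for_status()  # fail loudly on HTTP errors instead of parsing an error body
print(r.json())
# -> {"label": "Suicidal", "exit_stage": "stage1a", "mode": "safety",
#     "stage_probs": {...}}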

mode is "balanced" (default) or "safety".

Files

stage0/ — Cardiff RoBERTa, DA gate
stage1a/ — MentalBERT, Suicidal gate
stage1b/ — MentalBERT, Normal/Distress
stage2/ — MentalBERT, 5-class distress
stage3/ — Longformer, Dep/Sui re-scorer
config.json — thresholds, class order, metrics
handler.py — HF Inference Endpoints handler with mode switching

Training data

mohamedasem318/mental-health-dataset-extended-v5 — 88,293 rows from 6 sources (cssrs, olid, kaggle_bpd, kaggle, huggingface, swmh). Stratified 70/10/20 split, random_state=42. Per-source reliability constants are recorded in config.json but were not applied to the loss (plain class-weighted CE only — a prior attempt with a per-sample-weighted wrapper destabilised Stage 0 training under fp16/AMP).
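
A stratified 70/10/20 split can be illustrated with a small standard-library sketch. This is not the project's split code: `random_state=42` suggests a library splitter was used, which is an assumption here, so the sketch will not reproduce the actual partition. The function name `stratified_split` and the toy labels are hypothetical. It only shows how each class is divided in the same proportions.

```python
import random
from collections import defaultdict

def stratified_split(labels, fractions=(0.7, 0.1, 0.2), seed=42):
    """Return (train, val, test) index lists, split per class by `fractions`."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for label in sorted(by_class):
        idxs = by_class[label]
        rng.shuffle(idxs)  # shuffle within the class only
        n_train = round(fractions[0] * len(idxs))
        n_val = round(fractions[1] * len(idxs))
        train += idxs[:n_train]
        val += idxs[n_train:n_train + n_val]
        test += idxs[n_train + n_val:]  # remainder goes to test
    return train, val, test

# Toy label list, for illustration only.
labels = ["Normal"] * 50 + ["Depression"] * 30 + ["Suicidal"] * 20
train, val, test = stratified_split(labels)
print(len(train), len(val), len(test))  # 70 10 20
```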

Citation

If you use this model, please cite the Vibecheck project (Helwan University final-year project, 2026).
