Early Depression Detection using Longformer and Data Augmentation
This is a fine-tuned version of AIMH/mental-longformer-base-4096 for detecting linguistic markers of depression risk based on a user's entire posting history. This model is the primary artifact of the research project, "Early Depression Detection and Correlational Analysis on eRisk by Longformer and Data Augmentation."
Project Summary
This model was developed as part of a Master's research project to address the challenges of early depression detection from noisy and imbalanced social media data. The methodology involved:
- Fine-tuning a domain-specific Mental-Longformer model, chosen for its ability to handle long user histories (up to 4096 tokens).
- Implementing an advanced data augmentation strategy using Gemini 2.5 Flash Lite to mitigate severe class imbalance.
- Conducting a comprehensive correlational analysis to uncover behavioral, social, and linguistic patterns of depression online.
On the final held-out eRisk 2025 test set, this model achieved an F1-score of 0.77 for the depressed class, demonstrating robust generalization.
Training Procedure
Base Model
This model was fine-tuned from AIMH/mental-longformer-base-4096, a Longformer model pre-trained on a large corpus of text from online mental health forums, making it highly specialized for this domain.
Training Data
The model was fine-tuned on user-level data from the eRisk dataset (CLEF 2017, 2018, and 2022). Due to the sensitive nature and licensing of this data, it cannot be redistributed. Please refer to the official CLEF eRisk workshops for information on data access.
Data Augmentation Strategy
To address the critical challenges of data scarcity and class imbalance, a multi-pronged data augmentation strategy was employed for the depressed (minority) class, powered by Gemini 2.5 Flash Lite:
- Translation: Non-English posts from depressed users were translated into English to increase data volume.
- Paraphrasing: Gemini was prompted to generate multiple, contextually relevant rephrased versions of existing depressed posts, increasing linguistic diversity.
- Quality Control: Augmented samples were rigorously filtered based on semantic similarity and sentiment consistency to ensure high fidelity and prevent the introduction of noise.
This augmentation strategy proved highly effective, enabling the Longformer model to learn more robust patterns from an expanded minority class.
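The exact similarity model, sentiment classifier, and thresholds used for quality control are not specified in this card. The sketch below illustrates the filtering idea only, with a bag-of-words cosine similarity and a toy sentiment lexicon standing in for the real embedding-based and sentiment checks; the threshold of 0.5 is likewise an illustrative assumption.

```python
import math
import re
from collections import Counter

def cosine_similarity(a: str, b: str) -> float:
    """Bag-of-words cosine similarity (a stand-in for embedding-based semantic similarity)."""
    va, vb = Counter(re.findall(r"\w+", a.lower())), Counter(re.findall(r"\w+", b.lower()))
    dot = sum(va[w] * vb[w] for w in set(va) & set(vb))
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0

NEGATIVE_WORDS = {"down", "tired", "sad", "empty", "hopeless", "exhausted"}  # toy lexicon

def sentiment_sign(text: str) -> int:
    """Crude polarity: -1 if any negative-lexicon word appears, else 1 (stand-in for a real classifier)."""
    return -1 if NEGATIVE_WORDS & set(re.findall(r"\w+", text.lower())) else 1

def keep_augmented(original: str, augmented: str, sim_threshold: float = 0.5) -> bool:
    """Keep an augmented sample only if it stays semantically close and sentiment-consistent."""
    return (cosine_similarity(original, augmented) >= sim_threshold
            and sentiment_sign(original) == sentiment_sign(augmented))

original = "I feel so tired and empty lately"
paraphrase = "Lately I feel completely empty and tired"   # passes both checks
off_topic = "Had a great day at the beach with friends"   # rejected: dissimilar, positive
print(keep_augmented(original, paraphrase), keep_augmented(original, off_topic))
```

In the actual pipeline, the bag-of-words similarity would be replaced by sentence-embedding similarity and the lexicon by a trained sentiment model, but the accept/reject logic has the same two-gate shape.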
Performance
The model's performance was evaluated in two stages: through 5-fold cross-validation during training, and on a final, held-out test set (eRisk 2025).
Final Test Set Performance (eRisk 2025)
This is the primary result, showing the performance of the single best model on completely unseen data.
| Class | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| non-depressed (0) | 0.9658 | 0.9789 | 0.9723 | 807 |
| depressed (1) | 0.8132 | 0.7255 | 0.7668 | 102 |
| Accuracy | | | 0.9505 | 909 |
| Weighted Avg | 0.9486 | 0.9505 | 0.9493 | 909 |
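The per-class figures in the table are internally consistent: each F1-score is the harmonic mean of its precision and recall, and the weighted average combines the per-class F1-scores by support. A quick check:

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# Figures from the eRisk 2025 test table above
p_dep, r_dep, support_dep = 0.8132, 0.7255, 102
p_non, r_non, support_non = 0.9658, 0.9789, 807

f1_dep = f1(p_dep, r_dep)  # ~0.7668 for the depressed class
f1_non = f1(p_non, r_non)  # ~0.9723 for the non-depressed class

total = support_dep + support_non
weighted_f1 = (f1_dep * support_dep + f1_non * support_non) / total  # ~0.9493

print(f1_dep, f1_non, weighted_f1)
```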
Training & Validation Stability (5-Fold Cross-Validation)
To ensure the model is robust, it was trained using 5-fold cross-validation on the combined 2017-2022 eRisk datasets. The average performance across the 5 validation folds demonstrates the model's stability.
- Mean F1-Score across 5 Folds: 0.8623
- Standard Deviation of F1-Score: 0.0093
The low standard deviation indicates that the model performs consistently across different subsets of the training data. The model uploaded here is the best-performing single model from Fold 1 of this process.
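The card reports only the aggregate mean (0.8623) and standard deviation (0.0093), not the individual fold scores. For clarity, this is how those statistics are computed from per-fold validation F1-scores; the fold values below are hypothetical placeholders, not the actual results.

```python
import statistics

# Hypothetical per-fold validation F1-scores (the real fold values are not published)
fold_f1 = [0.871, 0.858, 0.855, 0.866, 0.861]

mean_f1 = statistics.mean(fold_f1)
std_f1 = statistics.stdev(fold_f1)  # sample standard deviation across folds

print(f"Mean F1: {mean_f1:.4f}, Std: {std_f1:.4f}")
```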
How to Use
You can use this model with a text-classification pipeline.
```python
from transformers import pipeline

# Load the model from the Hub
pipe = pipeline("text-classification", model="avtak/erisk-longformer-depression-v1")

# The model works best on longer texts that represent a collection of posts
user_posts = """
I've been feeling really down lately. Nothing seems fun anymore...
I tried playing my favorite game but I just couldn't get into it.
Sleep is my only escape but I wake up feeling just as tired.
"""

result = pipe(user_posts)
print(result)
# [{'label': 'LABEL_1', 'score': 0.85}] -> Example output where LABEL_1 is the "depressed" class
```
Ethical Considerations and Limitations
- Not a Diagnostic Tool: This model is NOT a medical diagnostic tool and should not be used as such. It only identifies statistical patterns in language that are correlated with a depression label in a specific dataset. Please consult a qualified healthcare professional for any mental health concerns.
- High Risk of Misuse: Using this model to automatically label or judge individuals online is a misuse of the technology. It should only be used for research purposes under ethical guidelines.
- Bias in Data: The training data is from Reddit, a platform with a specific demographic user base. The model may not generalize well to other platforms, cultures, or demographic groups. The linguistic expression of mental distress varies greatly.
- Correlation, not Causation: The model identifies linguistic patterns correlated with depression, not the causes of depression.
Author and Contact
This model was developed by Hassan Hassanzadeh Aliabadi as part of a Master in Data Science degree at Universiti Malaya.
- LinkedIn: https://www.linkedin.com/in/hassanzh/
- Hugging Face: https://huggingface.co/avtak
- Google Scholar: https://scholar.google.com/citations?hl=en&user=7sU9U1QAAAAJ
For questions about this model, please open a discussion on the Hugging Face community tab.
Citation
If you use this model in your research, please consider citing it:
```bibtex
@misc{hassanzadeh_aliabadi_erisk_2025,
  author       = {Hassan Hassanzadeh Aliabadi},
  title        = {Early Depression Detection and Correlational Analysis on eRisk by Longformer and Data Augmentation},
  year         = {2025},
  publisher    = {Hugging Face},
  journal      = {Hugging Face repository},
  howpublished = {\url{https://huggingface.co/avtak/erisk-longformer-depression-v1}}
}
```