---
license: cc-by-nc-4.0
language:
- de
base_model:
- google-bert/bert-base-german-cased
pipeline_tag: text-classification
tags:
- depression
- mental-health
- MADRS
- clinical
- interview
---
# MADRS-BERT
**MADRS-BERT** is a fine-tuned `bert-base-german-cased` model that predicts depression severity scores (0–6) across individual items of the [Montgomery-Åsberg Depression Rating Scale (MADRS)](https://en.wikipedia.org/wiki/MADRS). Each prediction is based on transcribed, structured clinician–patient interview segments.
- **Publication**: [https://www.nature.com/articles/s41746-025-01982-8#Sec8](https://www.nature.com/articles/s41746-025-01982-8#Sec8)
- **Example dataset**: [https://github.com/webersamantha/MADRS-BERT/data](https://github.com/webersamantha/MADRS-BERT/data)
- **GitHub repo**: Code for data curation, fine-tuning, and evaluation: [https://github.com/webersamantha/MADRS-BERT](https://github.com/webersamantha/MADRS-BERT)
This model was developed to support standardized, scalable mental health assessments in both clinical and low-resource settings.
## Model Details
- **Base model**: `bert-base-german-cased`
- **Task**: Ordinal regression (scores 0–6)
- **Language**: German
- **Input**: Text (dialogue segment grouped by MADRS topic)
- **Output**: Predicted score for each MADRS item (rounded integer 0–6)
- **Training data**: Mix of real and synthetic clinician–patient interviews (MADRS-structured)
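
The mapping from a raw model output to the reported item score can be sketched in a few lines. This is a minimal illustration, assuming the model emits one continuous value per dialogue segment that is rounded and clamped to the 0 to 6 range, as the inference snippet in this README does. `to_madrs_score` is a hypothetical helper name, not part of the released code.

```python
def to_madrs_score(raw_output: float) -> int:
    """Round a continuous prediction and clamp it to the valid MADRS item range 0-6."""
    # Python's round() rounds halves to the nearest even integer, like torch.round.
    rounded = round(raw_output)
    return max(0, min(6, rounded))

print(to_madrs_score(3.4))   # 3
print(to_madrs_score(-0.7))  # 0
print(to_madrs_score(7.2))   # 6
```

Clamping matters because a regression head is not bounded, so values slightly below 0 or above 6 can occur.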
## Intended Use
This model is intended for research and development use. It is not a certified medical device. The goal is to:
- Explore AI-assisted symptom severity assessment
- Enable structured evaluation of individual MADRS items
- Support clinicians or researchers working in psychiatry/mental health
---
## 🚀 How to Use
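### Preprocess the data file:
Organize your data like the example dataset (synthetic data), with the columns `Subject`, `Speaker`, `Transcription`, `Topic`, and `Score`. The helper below groups the rows by `Topic` and joins each group into one dialogue string.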
```python
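import pandas as pd

def load_and_prepare_conversations(filepath):
    """Return a list of (topic, dialogue) tuples, one per MADRS topic in the file."""
    # Reading .xlsx files with pandas requires an Excel engine such as openpyxl.
    df = pd.read_excel(filepath)
    conversations = []
    # Grouping is by Topic only, so rows from several subjects in one file would be merged.
    for topic in df['Topic'].unique():
        topic_df = df[df['Topic'] == topic]
        if topic_df.empty:
            continue
        # Join all non-empty turns into "Speaker: text" lines.
        dialogue = "\n".join([
            f"{row['Speaker']}: {row['Transcription']}"
            for _, row in topic_df.iterrows()
            if pd.notnull(row['Transcription'])
        ])
        conversations.append((topic, dialogue))
    return conversations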
```
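
If a file holds several interviews, grouping by `Topic` alone would mix subjects. The following sketch shows one way to key dialogues by `(Subject, Topic)` instead. It is a hypothetical variant, not the authors' preprocessing: it works on plain dictionaries (for example rows from `csv.DictReader`), and the sample rows and the topic label `Sleep` are invented for illustration.

```python
from collections import defaultdict

def group_by_subject_and_topic(rows):
    """Return {(subject, topic): dialogue} from rows with the README's column names."""
    turns = defaultdict(list)
    for row in rows:
        text = row.get("Transcription")
        if not text:  # skip empty or missing transcriptions
            continue
        turns[(row["Subject"], row["Topic"])].append(f"{row['Speaker']}: {text}")
    return {key: "\n".join(lines) for key, lines in turns.items()}

# Invented sample rows, for illustration only.
rows = [
    {"Subject": "S01", "Speaker": "Clinician", "Transcription": "Wie schlafen Sie?", "Topic": "Sleep"},
    {"Subject": "S01", "Speaker": "Patient", "Transcription": "Schlecht.", "Topic": "Sleep"},
    {"Subject": "S02", "Speaker": "Patient", "Transcription": "Gut.", "Topic": "Sleep"},
]
dialogues = group_by_subject_and_topic(rows)
print(dialogues[("S01", "Sleep")])
```

Each value in `dialogues` has the same `Speaker: text` line format that `load_and_prepare_conversations` produces, so it can be passed to the tokenizer in the same way.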
### Load model and tokenizer:
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
model_name = "webesama/MADRS-BERT"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval().to("cuda" if torch.cuda.is_available() else "cpu")
```
### Predict on a full structured interview / Run inference:
Given the `(topic, dialogue)` pairs produced by `load_and_prepare_conversations`, score each topic as follows:
```python
def predict_madrs_scores(conversations, tokenizer, model):
    device = model.device
    predictions = {}
    for topic, dialogue in conversations:
        # Dialogues longer than 512 tokens are truncated.
        inputs = tokenizer(dialogue, truncation=True, padding="max_length", max_length=512, return_tensors="pt").to(device)
        with torch.no_grad():
            # Round the model output and clamp it to the valid 0-6 range.
            score = torch.round(model(**inputs).logits).clamp(0, 6).item()
        predictions[topic] = int(score)
    return predictions

file_path = "example_interview.xlsx"
conversations = load_and_prepare_conversations(file_path)
scores = predict_madrs_scores(conversations, tokenizer, model)
print(scores)
```
---
## Acknowledgements
The model was trained and released by [Samantha Weber](https://github.com/webersamantha) within the framework of the [Multicast Project on predicting and treating suicidality](https://www.multicast.uzh.ch/en.html), as part of research efforts to improve AI-driven mental health tools. Thanks to all clinicians and collaborators who contributed to the annotated MADRS dataset.
## Evaluation
The model was evaluated on a held-out clinical validation set and achieved strong performance under both strict and flexible scoring criteria (±1 deviation tolerance). See publication for full metrics.
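
The two scoring criteria can be illustrated with a short sketch. The definitions below are an assumption about what strict and ±1 scoring mean (exact match versus a prediction within one point of the reference). The function names and the scores are invented for illustration; the publication defines the metrics actually reported.

```python
def strict_accuracy(y_true, y_pred):
    """Share of predictions that match the reference score exactly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def tolerant_accuracy(y_true, y_pred, tolerance=1):
    """Share of predictions within `tolerance` points of the reference score."""
    return sum(abs(t - p) <= tolerance for t, p in zip(y_true, y_pred)) / len(y_true)

# Invented scores, for illustration only.
reference = [3, 2, 5, 0]
predicted = [3, 3, 3, 0]
print(strict_accuracy(reference, predicted))    # 0.5
print(tolerant_accuracy(reference, predicted))  # 0.75
```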
## Citation
If you use this model, please cite:
> Weber, S. et al. (2025). "Using a Fine-tuned Large Language Model for Symptom-based Depression Evaluation" (DOI: https://doi.org/10.1038/s41746-025-01982-8)