---
license: mit
language:
- en
---
# 🧠 DistilBERT Response Type Classifier

This is a fine-tuned [DistilBERT](https://huggingface.co/distilbert-base-uncased) model that classifies a patient message by the type of counselor response it calls for. It predicts one of four mental health support categories:

- **advice**
- **information**
- **question**
- **validation**

It is used as part of the [Mental Health Counselor Assistant](https://huggingface.co/spaces/scdong/counselor-assistant) app, where the predicted category guides the generation of supportive, therapeutic responses.

## 💼 Use Case

Given a short text input from a patient, this model predicts the most appropriate **type of response** a mental health counselor might provide.

### Example:
```python
from transformers import DistilBertForSequenceClassification, DistilBertTokenizerFast
import torch

# Load the fine-tuned classifier and its tokenizer from the Hub
model = DistilBertForSequenceClassification.from_pretrained("scdong/distilbert-response-type")
tokenizer = DistilBertTokenizerFast.from_pretrained("scdong/distilbert-response-type")

text = "I just feel so overwhelmed lately"
# Truncate long messages so they fit the model's maximum input length
inputs = tokenizer(text, return_tensors="pt", truncation=True)

# Inference only, so skip gradient tracking
with torch.no_grad():
    logits = model(**inputs).logits

# Index of the highest-scoring class
predicted_label = torch.argmax(logits, dim=1).item()
print(predicted_label)  # Maps to: 0=advice, 1=information, 2=question, 3=validation
```
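
The integer printed by the example can be turned into a label name and a confidence score. The sketch below assumes the index order given in the comment (0=advice, 1=information, 2=question, 3=validation) and uses a plain-Python softmax so it runs without downloading the model. `ID2LABEL` and `label_with_confidence` are illustrative names, not part of this repo, and the logits are made up.

```python
import math

# Assumed index order, taken from the comment in the example above.
ID2LABEL = {0: "advice", 1: "information", 2: "question", 3: "validation"}

def label_with_confidence(logits):
    """Return (label, probability) for one row of raw logits."""
    # Numerically stable softmax: subtract the max before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Pick the index with the highest probability.
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]

# Made-up logits for illustration only, not real model output.
label, prob = label_with_confidence([0.2, -1.0, 0.1, 2.3])
print(label, round(prob, 3))
```

With real model output, one row of the logits tensor can be passed in as a list, for example `logits[0].tolist()`.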

The model is used to route text to custom prompt templates like:
- *Advice prompt*: “You are a licensed counselor. What supportive advice would you give to someone who said: {msg}?”
- *Validation prompt*: “You are an empathetic therapist. Validate the client’s emotions in response to: {msg}”
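
The routing step might look roughly like the sketch below. It is not the app's actual code: only the advice and validation templates are documented in this card, so `FALLBACK_TEMPLATE` is a hypothetical placeholder for the other two labels, and `build_prompt` and `ID2LABEL` are illustrative names.

```python
# Assumed index order, as stated in the example above.
ID2LABEL = {0: "advice", 1: "information", 2: "question", 3: "validation"}

# The two templates shown in this model card.
PROMPT_TEMPLATES = {
    "advice": (
        "You are a licensed counselor. What supportive advice would you "
        "give to someone who said: {msg}?"
    ),
    "validation": (
        "You are an empathetic therapist. Validate the client’s emotions "
        "in response to: {msg}"
    ),
}

# Hypothetical fallback for labels whose templates are not shown here.
FALLBACK_TEMPLATE = "Respond supportively to: {msg}"

def build_prompt(label_id, msg):
    """Select a prompt template by predicted label and fill in the message."""
    label = ID2LABEL[label_id]
    template = PROMPT_TEMPLATES.get(label, FALLBACK_TEMPLATE)
    return template.format(msg=msg)

print(build_prompt(3, "I just feel so overwhelmed lately"))
```

Keeping the templates in a dictionary keyed by label name means a new response type only needs a new entry, not a change to the routing function.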

## 📁 Files

This repo includes:
- `config.json` — model architecture config
- `model.safetensors` — trained model weights
- `tokenizer_config.json`, `tokenizer.json`, `vocab.txt` — tokenizer files
- `special_tokens_map.json` — optional token mappings
- `training_args.bin` — training metadata (optional)

## 🧪 Training Details

The model was fine-tuned using a balanced dataset labeled with response types based on:
- [Kaggle Mental Health Conversations](https://www.kaggle.com/datasets/ayaanalahmed/mental-health-conversations)
- [CounselChat dataset](https://github.com/nbertagnolli/counsel-chat)
- [PAIR dataset](https://lit.eecs.umich.edu/downloads.html#PAIR)

The final model was validated on a held-out test set and integrated into the counselor assistant tool.

## 📜 License

This model is released under the MIT license and is intended for research and educational purposes. Please use it responsibly and do not deploy it for unsupervised clinical use.