shallowblueQAQ committed · Commit 88e53c2 · verified · 1 Parent(s): bcf7dc4

Update README.md

Files changed (1): README.md (+115 -3)
---
license: cc-by-nc-4.0
tags:
- mental-health
- social-media
- life-events
---

# PsyEvent: Life Event Recognition System

This repository contains the models described in the paper **["Tracking Life's Ups and Downs: Mining Life Events from Social Media Posts for Mental Health Analysis"](https://aclanthology.org/2025.acl-long.345/)** (ACL 2025).

The system consists of two distinct models:
1. **Life Events Detection (`LE_detection`)**: a multi-label classifier that identifies 12 categories of life events in social media posts.
2. **Self-Status Determination (`Self-status_determination`)**: a binary classifier that determines whether a detected life event is experienced by the user themselves (Self) or by someone else.

## Model Organization

This repository stores each model's weights in its own **subfolder**, so you must pass the `subfolder` argument when loading.

- `LE_detection/`: the Life Event Detection model.
- `Self-status_determination/`: the Self-Status Determination model.

Both models share the same architecture (`BERTDiseaseClassifier`), defined in `model.py`.
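
For orientation, the layout above can be written out as plain hub-relative paths. This is only an illustrative sketch; the weight filename is assumed to follow the standard `pytorch_model.bin` convention used later in the Usage section:

```python
# Hypothetical helper: hub-relative paths of each model's weight file.
REPO_ID = "shallowblueQAQ/psyevent-model"
SUBFOLDERS = ["LE_detection", "Self-status_determination"]

weight_files = {sub: f"{sub}/pytorch_model.bin" for sub in SUBFOLDERS}
for sub, path in weight_files.items():
    print(f"{REPO_ID} -> {path}")
```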
## Usage

Because these models use a custom architecture (a BERT encoder plus a linear head on the `[CLS]` token, with no pooling layer), **you must define or import the model class locally** before loading the weights.

### 1. Installation

```bash
pip install transformers torch huggingface_hub
```

### 2. Define the Model Architecture

You can download the `model.py` file from this repository, or simply define the class in your code as shown below:

```python
from torch import nn
from transformers import AutoModel

class BERTDiseaseClassifier(nn.Module):
    def __init__(self, model_type, num_symps) -> None:
        super().__init__()
        self.model_type = model_type
        self.num_symps = num_symps
        self.encoder = AutoModel.from_pretrained(model_type)
        self.dropout = nn.Dropout(self.encoder.config.hidden_dropout_prob)
        self.clf = nn.Linear(self.encoder.config.hidden_size, num_symps)

    def forward(self, input_ids=None, attention_mask=None, token_type_ids=None, **kwargs):
        outputs = self.encoder(
            input_ids=input_ids,
            attention_mask=attention_mask,
            token_type_ids=token_type_ids,
        )
        x = outputs.last_hidden_state[:, 0, :]  # [CLS] pooling: first token only
        x = self.dropout(x)
        logits = self.clf(x)
        return logits
```
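
The pooling-plus-head steps of this forward pass can be sanity-checked without downloading any weights by running a dummy tensor through them. The shapes below are illustrative only (the real hidden size depends on the encoder):

```python
import torch
from torch import nn

# Dummy stand-in for encoder output: batch of 2 posts, 6 tokens, hidden size 8.
last_hidden_state = torch.randn(2, 6, 8)

# [CLS] pooling as in BERTDiseaseClassifier.forward: keep only the first position.
cls_vec = last_hidden_state[:, 0, :]

# Linear head emitting one logit per category (12 life events in LE_detection).
head = nn.Linear(8, 12)
logits = head(cls_vec)
print(cls_vec.shape, logits.shape)
```

Because the head is a plain `nn.Linear`, these logits are unnormalized; the Usage code below applies a sigmoid to turn them into independent per-event probabilities.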
### 3. Load the Models

Use the `subfolder` parameter to select which model you want to load.

```python
import torch
from transformers import AutoConfig, AutoTokenizer
from huggingface_hub import hf_hub_download
# from model import BERTDiseaseClassifier

repo_id = "shallowblueQAQ/psyevent-model"
subfolder = "LE_detection"
# subfolder = "Self-status_determination"

# 1. Load config & tokenizer
config = AutoConfig.from_pretrained(repo_id, subfolder=subfolder)
tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder=subfolder)

# 2. Initialize the model architecture
model = BERTDiseaseClassifier(model_type=config._name_or_path, num_symps=len(config.id2label))

# 3. Load the weights
weights_path = hf_hub_download(repo_id=repo_id, subfolder=subfolder, filename="pytorch_model.bin")
model.load_state_dict(torch.load(weights_path, map_location="cpu"))
model.eval()

# 4. Inference
text = "I lost my job yesterday and I feel terrible."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)

with torch.no_grad():
    logits = model(**inputs)
    probs = torch.sigmoid(logits)

# Display predictions (multi-label)
threshold = 0.5
for i, prob in enumerate(probs[0]):
    if prob > threshold:
        print(f"Detected: {config.id2label[i]} ({prob:.4f})")
```
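
The two models are intended to work as a pipeline: `LE_detection` proposes event categories, and `Self-status_determination` decides whether the post describes the user's own experience. One plausible way to combine their outputs is sketched below with placeholder probabilities and hypothetical label names; this is not the paper's exact post-processing:

```python
def gate_events(event_probs, self_prob, id2label,
                event_threshold=0.5, self_threshold=0.5):
    """Keep detected life events only when the post is judged to be about the user."""
    if self_prob <= self_threshold:
        return []  # event likely happened to someone else
    return [id2label[i] for i, p in enumerate(event_probs) if p > event_threshold]

id2label = {0: "job_loss", 1: "bereavement", 2: "relocation"}  # hypothetical names
print(gate_events([0.9, 0.2, 0.7], self_prob=0.8, id2label=id2label))  # ['job_loss', 'relocation']
print(gate_events([0.9, 0.2, 0.7], self_prob=0.3, id2label=id2label))  # []
```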
## Data Availability & Privacy Statement

This model was trained on a subset of the **SMHD (Self-reported Mental Health Diagnoses)** dataset.

**Due to the strict Data Usage Agreement of SMHD, we are prohibited from publishing or sharing any portion of the original dataset (including our annotated subset).** Researchers interested in reproducing this work or using the data must apply for access directly from the original creators of [SMHD (Cohan et al., 2018)](https://aclanthology.org/C18-1126/). We only provide the model weights and inference code here.

### Citation

If you use this model or dataset, please cite our paper:
```bibtex
@inproceedings{lv2025tracking,
  title={Tracking life's ups and downs: Mining life events from social media posts for mental health analysis},
  author={Lv, Minghao and Chen, Siyuan and Jin, Haoan and Yuan, Minghao and Ju, Qianqian and Peng, Yujia and Zhu, Kenny and Wu, Mengyue},
  booktitle={Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
  pages={6950--6965},
  year={2025}
}
```