---
library_name: transformers
license: other
base_model: Qwen/Qwen3-8B
tags:
- llama-factory
- full
- generated_from_trainer
---
<p align="center">
<img src="https://huggingface.co/GMLHUHE/PsyLLM-8B/resolve/main/logo.jpg"
alt="PsyLLM logo"
width="200px" />
</p>
<p align="center">
<a href="https://arxiv.org/pdf/2505.15715v2">
<img src="https://img.shields.io/badge/arXiv-2505.15715-b31b1b.svg" alt="arXiv">
</a>
<a href="https://github.com/Emo-gml/PsyLLM">
<img src="https://img.shields.io/badge/GitHub-Emo--gml%2FPsyLLM-blue?logo=github" alt="GitHub">
</a>
</p>
**PsyLLM** is a large language model designed for **psychological counseling and mental health dialogue generation**. It integrates **diagnostic reasoning** and **therapeutic reasoning**, following established frameworks such as **DSM/ICD**, and incorporates diverse therapeutic approaches including **CBT**, **ACT**, and **psychodynamic therapy**.
The model is trained on the [**OpenR1-Psy**](https://huggingface.co/datasets/GMLHUHE/OpenR1-Psy) dataset ([arXiv:2505.15715](https://arxiv.org/pdf/2505.15715v2)),
featuring multi-turn counseling dialogues with explicit reasoning traces that support **clinically informed, empathetic, and interpretable** AI-assisted therapy.
The training process is implemented based on the open-source framework [**LLaMA-Factory**](https://github.com/hiyouga/LLaMA-Factory).
If you find this project helpful, feel free to ⭐ it! [PsyLLM](https://github.com/Emo-gml/PsyLLM)
## Inference Example
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "GMLHUHE/PsyLLM-8B"

# load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype="auto",
    device_map="auto"
)

# prepare the model input
prompt = "I have participated in big group sessions before where I was left to find my own safe place, but it hasn't worked for me."
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# conduct text completion
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=32768
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()

# parse the thinking content (151668 is the </think> token id in the Qwen3 tokenizer)
try:
    # index of the token right after the last </think> marker
    index = len(output_ids) - output_ids[::-1].index(151668)
except ValueError:
    index = 0
thinking_content = tokenizer.decode(output_ids[:index], skip_special_tokens=True).strip("\n")
content = tokenizer.decode(output_ids[index:], skip_special_tokens=True).strip("\n")

print("PsyLLM thinking content:", thinking_content)
print("PsyLLM content:", content)
```
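The parsing step above splits the generated ids at the *last* occurrence of the end-of-thinking token (`</think>`, id 151668 in the Qwen3 tokenizer), falling back to "no thinking segment" when the marker is absent. A minimal sketch of that logic as a standalone helper, runnable on synthetic token ids without loading the model (the function name `split_thinking` is illustrative, not part of the `transformers` API):

```python
def split_thinking(output_ids, think_end_id=151668):
    """Split generated token ids into (thinking, answer) segments.

    Searches for the last occurrence of the end-of-thinking token;
    if it is absent, everything is treated as the answer.
    """
    try:
        # index of the token right after the last think_end_id marker
        index = len(output_ids) - output_ids[::-1].index(think_end_id)
    except ValueError:
        index = 0
    return output_ids[:index], output_ids[index:]

# synthetic ids: [10, 11, 151668] is the thinking trace, [20, 21] the answer
thinking, answer = split_thinking([10, 11, 151668, 20, 21])
# thinking -> [10, 11, 151668], answer -> [20, 21]
```

Decoding each segment with `skip_special_tokens=True`, as in the full example above, then strips the marker token itself from the text.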
---
## 📄 Citation
If you use this model or the OpenR1-Psy dataset, please cite:
```bibtex
@article{hu2025beyond,
title={Beyond Empathy: Integrating Diagnostic and Therapeutic Reasoning with Large Language Models for Mental Health Counseling},
author={Hu, He and Zhou, Yucheng and Si, Juzheng and Wang, Qianning and Zhang, Hengheng and Ren, Fuji and Ma, Fei and Cui, Laizhong},
journal={arXiv preprint arXiv:2505.15715},
year={2025}
}
```
---
## 🧩 License
For **research and educational use only.**
Please ensure compliance with **ethical and legal standards** in mental health AI research.