ZhangCNN/MindGLM

@@ -20,12 +20,44 @@ To use MindGLM with the Hugging Face Transformers library:
     from transformers import AutoTokenizer, AutoModelForCausalLM
     tokenizer = AutoTokenizer.from_pretrained("ZhangCNN/MindGLM")
     model = AutoModelForCausalLM.from_pretrained("ZhangCNN/MindGLM")
-    input_text = "Your input text here"
-    input_ids = tokenizer.encode(input_text, return_tensors="pt")
-    output = model.generate(input_ids)
-    decoded_output = tokenizer.decode(output[0], skip_special_tokens=True)
-    print(decoded_output)
 '
@@ -58,4 +90,96 @@ For any queries, feedback, or collaboration opportunities, please reach out to:
 - Email: [zcm200605@163.com]
 - WeChat: [Zhang_CNN]
 - Affiliation: [University of Glasgow]
-- We hope MindGLM proves to be a valuable asset in the realm of digital psychological counseling for the Chinese-speaking community. Your feedback and contributions are always welcome!

     from transformers import AutoTokenizer, AutoModelForCausalLM
     tokenizer = AutoTokenizer.from_pretrained("ZhangCNN/MindGLM")
     model = AutoModelForCausalLM.from_pretrained("ZhangCNN/MindGLM")
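+    history = []
+    end_command = "结束对话"  # typing this ("end conversation") exits the loop
+    max_length = 600  # character budget for the retained history
+    max_turns = 12    # maximum number of retained turns
+    # Instruction (kept in Chinese for the model): "Act as a friendly counselor, avoid revealing any
+    # AI or technical details, and proactively guide the help-seeker and help resolve their problems."
+    instruction = "假设你是友善的心理辅导师，避免透露任何AI或技术的信息。请主动引导咨询者交流并帮助他们解决心理问题。"
+    # Read the first user message; prompts use the role labels "求助者" (help-seeker) and "支持者" (supporter)
+    prompt = input("求助者：")
+    # Prepend the instruction and wrap the message in the role labels
+    prompt = instruction + "求助者：" + prompt + "。支持者:"
+    response, history = model.chat(tokenizer, prompt, history=history)
+    print("支持者: " + response)
+    while True:
+        # Read the next user message
+        user_input = input("求助者：")
+        # Stop when the end command is received
+        if user_input == end_command:
+            break
+        # Wrap the message in the role labels
+        prompt = "求助者：" + user_input + "。支持者:"
+        response, history = model.chat(tokenizer, prompt, history=history)
+        print("支持者: " + response)
+        # Total character length of the history, including the new turn
+        # (history[0] already contains the instruction, so it is not counted again)
+        total_length = sum(len(f'{item[0]} {item[1]}') for item in history)
+        # Drop old turns while the character or turn limit is exceeded;
+        # the len(history) > 1 guard avoids an IndexError on a one-turn history
+        while len(history) > 1 and (total_length > max_length or len(history) > max_turns):
+            # history[0] carries the instruction, so remove the second entry (the oldest ordinary turn)
+            removed_item = history.pop(1)
+            total_length -= len(f'{removed_item[0]} {removed_item[1]}')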
 '
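The history-trimming step in the usage example above can be pulled out into a small function and checked without loading the model. This is a minimal sketch, not part of the MindGLM repository: the name `trim_history` is hypothetical, and it assumes each history entry is a `(prompt, response)` pair whose first entry carries the instruction. It also guards against a history with only one entry, which the loop above does not.

```python
from typing import List, Tuple

# Assumed shape of one history entry: (prompt, response)
Turn = Tuple[str, str]


def trim_history(history: List[Turn], max_length: int = 600, max_turns: int = 12) -> List[Turn]:
    """Return a copy of history that fits both limits, always keeping entry 0."""
    trimmed = list(history)
    total = sum(len(f"{q} {r}") for q, r in trimmed)
    # Entry 0 carries the instruction, so drop the oldest ordinary turn
    # (index 1) until both limits hold or nothing else can be dropped.
    while len(trimmed) > 1 and (total > max_length or len(trimmed) > max_turns):
        q, r = trimmed.pop(1)
        total -= len(f"{q} {r}")
    return trimmed


if __name__ == "__main__":
    demo = [("instruction + first message", "reply 0")] + [
        (f"message {i}", f"reply {i}") for i in range(1, 6)
    ]
    # With a three-turn limit, only the first entry and the two newest turns remain.
    print([q for q, _ in trim_history(demo, max_length=1000, max_turns=3)])
```

Returning a new list instead of mutating the argument keeps the function easy to test; in the chat loop it would be called as `history = trim_history(history)` after each `model.chat` call.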
 - Email: [zcm200605@163.com]
 - WeChat: [Zhang_CNN]
 - Affiliation: [University of Glasgow]
+- We hope MindGLM proves to be a valuable asset in the realm of digital psychological counseling for the Chinese-speaking community. Your feedback and contributions are always welcome!
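+# MindGLM: An Aligned Large Language Model for Chinese Psychological Counseling
+1. Introduction
+MindGLM is a large language model fine-tuned and aligned for Chinese psychological counseling. Built on the ChatGLM2-6B base model, it is designed to align with human preferences in counseling and to serve as a reliable, safe tool for digital psychological counseling.
+2. Key Features
+- Fine-tuned for psychological counseling: MindGLM is trained to understand and respond to counseling queries, with the aim of giving accurate, empathetic responses.
+- Aligned with human preferences: the model went through an alignment process so that its responses match human values and preferences in the counseling domain.
+- High performance: MindGLM performed strongly in both quantitative and qualitative evaluations, making it a strong choice for digital psychological interventions.
+3. Usage
+To use MindGLM with the Hugging Face Transformers library: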
+'
+    from transformers import AutoTokenizer, AutoModelForCausalLM
+    tokenizer = AutoTokenizer.from_pretrained("ZhangCNN/MindGLM")
+    model = AutoModelForCausalLM.from_pretrained("ZhangCNN/MindGLM")
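+    history = []
+    end_command = "结束对话"  # typing this ("end conversation") exits the loop
+    max_length = 600  # character budget for the retained history
+    max_turns = 12    # maximum number of retained turns
+    # Instruction (kept in Chinese for the model): "Act as a friendly counselor, avoid revealing any
+    # AI or technical details, and proactively guide the help-seeker and help resolve their problems."
+    instruction = "假设你是友善的心理辅导师，避免透露任何AI或技术的信息。请主动引导咨询者交流并帮助他们解决心理问题。"
+    # Read the first user message; prompts use the role labels "求助者" (help-seeker) and "支持者" (supporter)
+    prompt = input("求助者：")
+    # Prepend the instruction and wrap the message in the role labels
+    prompt = instruction + "求助者：" + prompt + "。支持者:"
+    response, history = model.chat(tokenizer, prompt, history=history)
+    print("支持者: " + response)
+    while True:
+        # Read the next user message
+        user_input = input("求助者：")
+        # Stop when the end command is received
+        if user_input == end_command:
+            break
+        # Wrap the message in the role labels
+        prompt = "求助者：" + user_input + "。支持者:"
+        response, history = model.chat(tokenizer, prompt, history=history)
+        print("支持者: " + response)
+        # Total character length of the history, including the new turn
+        # (history[0] already contains the instruction, so it is not counted again)
+        total_length = sum(len(f'{item[0]} {item[1]}') for item in history)
+        # Drop old turns while the character or turn limit is exceeded;
+        # the len(history) > 1 guard avoids an IndexError on a one-turn history
+        while len(history) > 1 and (total_length > max_length or len(history) > max_turns):
+            # history[0] carries the instruction, so remove the second entry (the oldest ordinary turn)
+            removed_item = history.pop(1)
+            total_length -= len(f'{removed_item[0]} {removed_item[1]}')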
+'
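+4. Training Data
+MindGLM was trained on a combination of open-source and self-built datasets to cover a broad range of counseling scenarios. The datasets include SmileConv, comparison_data_v1, psychology-RLAIF, rm_labelled_180, and rm_gpt_375.
+5. Training Process
+The model was trained in three stages, all using LoRA:
+- Supervised fine-tuning: starting from the ChatGLM2-6B base model, MindGLM was fine-tuned on counseling-specific datasets.
+- Reward model training: a reward model was trained to evaluate and score the fine-tuned model's responses.
+- Reinforcement learning: the model was further tuned with the PPO (Proximal Policy Optimization) algorithm so that its responses align with human preferences.
+6. Limitations
+MindGLM is a capable tool, but users should be aware of its limitations:
+- It is designed for psychological counseling but should not replace professional medical advice or intervention.
+- Its responses are based on its training data; although it is aligned with human preferences, it may not always give the most appropriate response.
+7. License
+Please refer to the license terms of the datasets used for training. Use of MindGLM should comply with those licenses.
+8. Contact
+For any queries, feedback, or collaboration opportunities, please reach out to:
+- Name: [张淙冕]
+- Email: [zcm200605@163.com]
+- WeChat: [Zhang_CNN]
+- Affiliation: [University of Glasgow]
+- We hope MindGLM proves to be a valuable asset in the realm of digital psychological counseling for the Chinese-speaking community. Your feedback and contributions are always welcome!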
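
The training-process section names PPO for the reinforcement-learning stage but does not show the objective. The sketch below illustrates the generic clipped surrogate objective from PPO in plain Python. It is not MindGLM's training code: the function name `ppo_clipped_objective` is hypothetical, the `clip_eps` default of 0.2 is an assumed value (the README gives no hyperparameters), and in an RLHF setup the advantages would typically be derived from reward-model scores, a step that is omitted here.

```python
import math
from typing import Sequence


def ppo_clipped_objective(
    new_logprobs: Sequence[float],
    old_logprobs: Sequence[float],
    advantages: Sequence[float],
    clip_eps: float = 0.2,  # assumed value; the README gives no hyperparameters
) -> float:
    """Mean clipped surrogate objective (to be maximized) over sampled actions."""
    terms = []
    for new_lp, old_lp, adv in zip(new_logprobs, old_logprobs, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s), computed from log-probabilities
        ratio = math.exp(new_lp - old_lp)
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # The minimum removes any gain from pushing the ratio outside the clip range
        terms.append(min(ratio * adv, clipped * adv))
    return sum(terms) / len(terms)


if __name__ == "__main__":
    # An unchanged policy (ratio 1) simply returns the mean advantage.
    print(ppo_clipped_objective([0.0, 0.0], [0.0, 0.0], [1.0, 3.0]))
```

When the new policy doubles the probability of an action with positive advantage, the term is capped at `(1 + clip_eps) * advantage`, which is what keeps each update close to the policy that generated the samples.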