Rx-ZJU/LLMDF-EXP

@@ -6,17 +6,18 @@ base_model:
 - zai-org/chatglm3-6b
 pipeline_tag: text-generation
 ---
-## 简介
-用于体重管理饮食反馈任务的大语言模型-拓展版（Large Language Model for Dietary Feedback Expand Version, LLMDF-EXP）是使用真实世界数据引导的合成数据（Human Data Induced Synthetic Data Generation, HDI-SDG）微调得到的模型。具体微调方式见论文：**Using Large Language Models to Generate Dietary Feedback Similar to Human Experts in Weight Management: Based on Real-world Scenario Data**
-## 特性
-我们收集了20个真实的饮食反馈场景，并让人类专家编写了饮食反馈作为基准（reference text），比较了LLMDF-EXP和[LLMDF](https://huggingface.co/Rx-ZJU/LLMDF)、[ChatGLM3-6B](https://huggingface.co/zai-org/chatglm3-6b)、GPT-4、GPT-3.5生成饮食反馈的表现，其中ChatGLM3-6B是LLMDF-EXP微调的基座模型，LLMDF则是HDI-SDG与直接真实世界数据微调对比的baseline模型，GPT-3.5是HDI-SDG方法时使用的大模型，GPT-4则是在OpenAI公布的GPT-3.5后一代模型。
-我们同时进行了机器评估和人类评估，在机器评估方面：我们使用了基于词元的评估指标BLEU、ROUGE、METEOR以及[基于预训练语言模型](https://huggingface.co/google-bert/bert-base-chinese)的BERTScore指标，在人类评估方面：我们使用了李克特量表（Likert Scale）对生成饮食反馈的同理心、完整性、专业性、实用性进行了评估，具体评估过程见上述论文。
-机器评估与人类评估的结果显示，**LLMDF-EXP**生成的饮食反馈与人类专家精心编写的最为接近：同时避免了**真实世界数据质量限制导致的提供充足信息的能力的缺失（[LLMDF的缺陷](https://huggingface.co/Rx-ZJU/LLMDF)）以及通用大语言模型机器生成痕迹过于明显的缺陷（未整合真实场景数据的大模型，ChatGLM3、GPT模型生成的饮食反馈往往过于冗长且表达形式化）**，经过人类评估的统计以及定性分析，**LLMDF-EXP能够生成综合性、具有实践性、人性化的饮食反馈，具有最好的真实场景适应性**
-相关的评估指标如下表所示：
 | Model      | BLEU-1 | ROUGE-L | METEOR | P<sub>BERT</sub> | R<sub>BERT</sub> | F<sub>BERT</sub> |
 |------------|--------|---------|--------|------------------|------------------|------------------|
@@ -33,17 +34,17 @@ pipeline_tag: text-generation
 | **Professionalism:** The feedback is nutritionally accurate, tailored, and responds soundly to the user.| 3.93±0.77  | 2.86±0.83  | 3.83±0.57  | 3.41±0.72   | 3.76±0.58 | 3.63±0.66 |
 | **Usefulness:** I can use it as a template to write my feedback to this scene.| 3.70±0.78  | 2.73±0.77  | 3.79±0.72  | 3.03±0.57   | 3.50±0.72 | 3.30±0.68 |
-## 使用方式
-LLMDF-EXP是由[ChatGLM3-6B](https://huggingface.co/zai-org/chatglm3-6b)微调得到的，当使用Hugging Face的``transformers``进行推理时，需要注意构造与训练时对话格式一致的[提示词模版](https://github.com/zai-org/ChatGLM3/blob/main/PROMPT.md)
 ```
 <|user|>
-想象你是一位营养师，你将收到进行饮食管理的客户的饮食信息。你的任务是提供对客户饮食信息的反馈，你需要提供有见解性的内容，给客户饮食指导，引导客户进行健康的生活方式。你可以参考从以下角度组织你给客户的饮食反馈：分析客户摄入食物的属性、提供健康食物的选择、帮助客户制定饮食模式、鼓励客户维持健康的饮食习惯、建议客户修正饮食习惯。<MealInformation><MealType>早餐</MealType><MealIntakeInformation>豆浆 1碗;鸡蛋 1中只;猪肉玉米胡萝卜糯米饭 200克;</MealIntakeInformation></MealInformation><PersonalHealthInformation><Gender>男</Gender><Age>19</Age><Height>180cm</Height><Weight>84.0kg</Weight><Waistline>85cm</Waistline><LaborIntensity>中等劳动强度</LaborIntensity></PersonalHealthInformation>
 <|assistant|>
 ```
-当使用上述示例的提示词时，LLMDF-EXP的生成结果为：
 ```
 <|user|>
-想象你是一位营养师，你将收到进行饮食管理的客户的饮食信息。你的任务是提供对客户饮食信息的反馈，你需要提供有见解性的内容，给客户饮食指导，引导客户进行健康的生活方式。你可以参考从以下角度组织你给客户的饮食反馈：分析客户摄入食物的属性、提供健康食物的选择、帮助客户制定饮食模式、鼓励客户维持健康的饮食习惯、建议客户修正饮食习惯。<MealInformation><MealType>早餐</MealType><MealIntakeInformation>豆浆 1碗;鸡蛋 1中只;猪肉玉米胡萝卜糯米饭 200克;</MealIntakeInformation></MealInformation><PersonalHealthInformation><Gender>男</Gender><Age>19</Age><Height>180cm</Height><Weight>84.0kg</Weight><Waistline>85cm</Waistline><LaborIntensity>中等劳动强度</LaborIntensity></PersonalHealthInformation>
 <|assistant|>
-<DietFeedback>早餐的搭配还算合理，但建议减少猪肉和米饭的摄入量，增加蔬菜的摄入，例如加入一份青菜或者番茄。此外，如果喜欢喝豆浆，可以选择无糖的豆浆或者低脂的豆浆来替代普通豆浆。</DietFeedback>
 ```

 - zai-org/chatglm3-6b
 pipeline_tag: text-generation
 ---
+\[ English | [中文](README_zh.md) \]
+## Introduction
+The Large Language Model for Dietary Feedback - Expanded Version (LLMDF-EXP) is fine-tuned on synthetic data produced by the Human Data Induced Synthetic Data Generation (HDI-SDG) approach. For details of the fine-tuning process, please refer to the paper: "Using Large Language Models to Generate Dietary Feedback Similar to Human Experts in Weight Management: Based on Real-World Scenario Data."
+## Features
+We collected 20 real dietary feedback scenarios and had human experts write dietary feedback to serve as the reference text. We compared LLMDF-EXP with [LLMDF](https://huggingface.co/Rx-ZJU/LLMDF), [ChatGLM3-6B](https://huggingface.co/zai-org/chatglm3-6b), GPT-4, and GPT-3.5 under the same settings. Among them, ChatGLM3-6B is the base model from which LLMDF-EXP is fine-tuned, LLMDF is the baseline for comparing HDI-SDG against fine-tuning directly on real-world data, GPT-3.5 is the large language model used in the HDI-SDG method, and GPT-4 is OpenAI's successor to GPT-3.5.
+We conducted both machine and human evaluations. For machine evaluation, we used the token-overlap metrics BLEU, ROUGE, and METEOR, together with [BERTScore computed with a pretrained language model](https://huggingface.co/google-bert/bert-base-chinese). For human evaluation, we rated the empathy, completeness, professionalism, and usefulness of the generated dietary feedback on a Likert scale, as detailed in the paper cited above.
+Both machine and human evaluations showed that **LLMDF-EXP** generated the dietary feedback closest to that written by human experts. It **avoided both the inability to provide sufficient information caused by real-world data quality constraints ([a flaw of LLMDF](https://huggingface.co/Rx-ZJU/LLMDF)) and the overly obvious machine-generated feel of general-purpose large language models (feedback from models that do not incorporate real-scenario data, such as ChatGLM3 and GPT, was often excessively verbose and formulaic).** Statistical and qualitative analyses of the human evaluations indicate that **LLMDF-EXP can generate comprehensive, practical, and humanized dietary feedback, with the best adaptability to real-world scenarios.**
+The relevant evaluation metrics are shown in the table below:
 | Model      | BLEU-1 | ROUGE-L | METEOR | P<sub>BERT</sub> | R<sub>BERT</sub> | F<sub>BERT</sub> |
 |------------|--------|---------|--------|------------------|------------------|------------------|
 | **Professionalism:** The feedback is nutritionally accurate, tailored, and responds soundly to the user.| 3.93±0.77  | 2.86±0.83  | 3.83±0.57  | 3.41±0.72   | 3.76±0.58 | 3.63±0.66 |
 | **Usefulness:** I can use it as a template to write my feedback to this scene.| 3.70±0.78  | 2.73±0.77  | 3.79±0.72  | 3.03±0.57   | 3.50±0.72 | 3.30±0.68 |
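As a rough illustration of what the BLEU-1 column in the machine-evaluation table measures, the sketch below computes sentence-level BLEU-1 (clipped unigram precision multiplied by a brevity penalty) for one candidate against one reference. Character-level tokenization and the function name `bleu1` are assumptions made for this example; this README does not state the tokenizer or the BLEU implementation used in the paper.

```python
import math
from collections import Counter


def bleu1(candidate: str, reference: str) -> float:
    """Sentence-level BLEU-1 against a single reference.

    Assumption: each non-whitespace character is one token, a common
    choice for Chinese text. The paper's setup may differ.
    """
    cand = [ch for ch in candidate if not ch.isspace()]
    ref = [ch for ch in reference if not ch.isspace()]
    if not cand:
        return 0.0

    cand_counts = Counter(cand)
    ref_counts = Counter(ref)
    # Clip each candidate token count by its count in the reference,
    # so repeating a matching token is not rewarded.
    overlap = sum(min(n, ref_counts[tok]) for tok, n in cand_counts.items())
    precision = overlap / len(cand)

    # The brevity penalty lowers the score of candidates that are
    # shorter than the reference.
    if len(cand) >= len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))
    return brevity_penalty * precision
```

Because the score is a precision over candidate tokens, a long answer that adds material absent from the reference lowers it, which is one plausible reason verbose outputs score lower on this metric.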
+## Inference
+LLMDF-EXP is fine-tuned from [ChatGLM3-6B](https://huggingface.co/zai-org/chatglm3-6b). When running inference with Hugging Face's `transformers`, construct the prompt with a [prompt template](https://github.com/zai-org/ChatGLM3/blob/main/PROMPT.md) that matches the dialogue format used during training.
 ```
 <|user|>
+Imagine you are a dietitian and you will receive dietary information from a client who is undergoing dietary management. Your task is to give the client feedback on the dietary information, and you need to provide insightful content to give the client dietary guidance to guide him towards a healthy lifestyle. \nYou can refer to organizing the dietary feedback from the following perspectives: analyzing the attributes of the food consumed by the client, providing healthy food choices, helping the client to develop a dietary pattern, encouraging the client to maintain healthy eating habits, and advising the client to modify his or her dietary habits.<MealInformation><MealType>Breakfast</MealType><MealIntakeInformation>Soy milk 1 bowl; 1 medium egg; pork, corn, and carrot glutinous rice 200 grams;</MealIntakeInformation></MealInformation><PersonalHealthInformation><Gender>Male</Gender><Age>19</Age><Height>180cm</Height><Weight>84.0kg</Weight><Waistline>85cm</Waistline><LaborIntensity>Medium</LaborIntensity></PersonalHealthInformation>
 <|assistant|>
 ```
+With the example prompt above, LLMDF-EXP generates:
 ```
 <|user|>
+Imagine you are a dietitian and you will receive dietary information from a client who is undergoing dietary management. Your task is to give the client feedback on the dietary information, and you need to provide insightful content to give the client dietary guidance to guide him towards a healthy lifestyle. \nYou can refer to organizing the dietary feedback from the following perspectives: analyzing the attributes of the food consumed by the client, providing healthy food choices, helping the client to develop a dietary pattern, encouraging the client to maintain healthy eating habits, and advising the client to modify his or her dietary habits.<MealInformation><MealType>Breakfast</MealType><MealIntakeInformation>Soy milk 1 bowl; 1 medium egg; pork, corn, and carrot glutinous rice 200 grams;</MealIntakeInformation></MealInformation><PersonalHealthInformation><Gender>Male</Gender><Age>19</Age><Height>180cm</Height><Weight>84.0kg</Weight><Waistline>85cm</Waistline><LaborIntensity>Medium</LaborIntensity></PersonalHealthInformation>
 <|assistant|>
+<DietFeedback>The breakfast combination is fairly reasonable, but it is recommended to reduce the intake of pork and rice, and increase the intake of vegetables, such as adding a portion of greens or tomatoes. In addition, if you like drinking soy milk, you can choose unsweetened soy milk or low-fat soy milk to replace regular soy milk.</DietFeedback>
 ```
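The prompt and output shown above can be assembled and parsed programmatically. The following is a minimal sketch, not the authors' inference script. The helper names `build_query`, `extract_feedback`, and `generate_feedback` are hypothetical. The sketch assumes that the `Rx-ZJU/LLMDF-EXP` repository ships full weights together with ChatGLM3's remote code (so that `model.chat` is available), that a CUDA GPU is present, and that the installed `transformers` version is compatible with that remote code.

```python
import re
from typing import Optional


def build_query(instruction: str, meal_type: str, intake: str, gender: str,
                age: int, height_cm: int, weight_kg: float, waist_cm: int,
                labor_intensity: str) -> str:
    """Append the tagged meal and health fields to the instruction text,
    using the tag names from the example prompt."""
    return (
        f"{instruction}"
        f"<MealInformation><MealType>{meal_type}</MealType>"
        f"<MealIntakeInformation>{intake}</MealIntakeInformation></MealInformation>"
        f"<PersonalHealthInformation><Gender>{gender}</Gender><Age>{age}</Age>"
        f"<Height>{height_cm}cm</Height><Weight>{weight_kg}kg</Weight>"
        f"<Waistline>{waist_cm}cm</Waistline>"
        f"<LaborIntensity>{labor_intensity}</LaborIntensity>"
        f"</PersonalHealthInformation>"
    )


def extract_feedback(response: str) -> Optional[str]:
    """Return the text inside <DietFeedback>...</DietFeedback>, or None."""
    match = re.search(r"<DietFeedback>(.*?)</DietFeedback>", response, flags=re.DOTALL)
    return match.group(1).strip() if match else None


def generate_feedback(query: str, model_id: str = "Rx-ZJU/LLMDF-EXP") -> str:
    """Run one chat turn. Assumes ChatGLM3 remote code and a CUDA GPU."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModel.from_pretrained(model_id, trust_remote_code=True).half().cuda().eval()
    # ChatGLM3's chat() wraps the query in the <|user|> ... <|assistant|> format.
    response, _history = model.chat(tokenizer, query, history=[])
    return response


if __name__ == "__main__":
    query = build_query(
        instruction="Imagine you are a dietitian ...",  # placeholder; paste the full instruction text
        meal_type="Breakfast",
        intake="Soy milk 1 bowl; 1 medium egg; pork, corn, and carrot glutinous rice 200 grams;",
        gender="Male", age=19, height_cm=180, weight_kg=84.0, waist_cm=85,
        labor_intensity="Medium",
    )
    print(query)
    # Uncomment on a machine with the model and a GPU available:
    # print(extract_feedback(generate_feedback(query)))
```

Keeping prompt construction and output parsing separate from model loading lets the string handling be checked without downloading the weights.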

README_zh.md ADDED

	@@ -0,0 +1,50 @@

+---
+language:
+- zh
+- en
+base_model:
+- zai-org/chatglm3-6b
+pipeline_tag: text-generation
+---
+\[ [English](README.md) | 中文 \]
+## 简介
+用于体重管理饮食反馈任务的大语言模型-拓展版（Large Language Model for Dietary Feedback Expand Version, LLMDF-EXP）是使用真实世界数据引导的合成数据（Human Data Induced Synthetic Data Generation, HDI-SDG）微调得到的模型。具体微调方式见论文：**Using Large Language Models to Generate Dietary Feedback Similar to Human Experts in Weight Management: Based on Real-world Scenario Data**
+## 特性
+我们收集了20个真实的饮食反馈场景，并让人类专家编写了饮食反馈作为基准（reference text），比较了LLMDF-EXP和[LLMDF](https://huggingface.co/Rx-ZJU/LLMDF)、[ChatGLM3-6B](https://huggingface.co/zai-org/chatglm3-6b)、GPT-4、GPT-3.5生成饮食反馈的表现，其中ChatGLM3-6B是LLMDF-EXP微调的基座模型，LLMDF则是HDI-SDG与直接真实世界数据微调对比的baseline模型，GPT-3.5是HDI-SDG方法时使用的大模型，GPT-4则是在OpenAI公布的GPT-3.5后一代模型。
+我们同时进行了机器评估和人类评估，在机器评估方面：我们使用了基于词元的评估指标BLEU、ROUGE、METEOR以及[基于预训练语言模型](https://huggingface.co/google-bert/bert-base-chinese)的BERTScore指标，在人类评估方面：我们使用了李克特量表（Likert Scale）对生成饮食反馈的同理心、完整性、专业性、实用性进行了评估，具体评估过程见上述论文。
+机器评估与人类评估的结果显示，**LLMDF-EXP**生成的饮食反馈与人类专家精心编写的最为接近：同时避免了**真实世界数据质量限制导致的提供充足信息的能力的缺失（[LLMDF的缺陷](https://huggingface.co/Rx-ZJU/LLMDF)）以及通用大语言模型机器生成痕迹过于明显的缺陷（未整合真实场景数据的大模型，ChatGLM3、GPT模型生成的饮食反馈往往过于冗长且表达形式化）**，经过人类评估的统计以及定性分析，**LLMDF-EXP能够生成综合性、具有实践性、人性化的饮食反馈，具有最好的真实场景适应性**
+相关的评估指标如下表所示：
+| Model      | BLEU-1 | ROUGE-L | METEOR | P<sub>BERT</sub> | R<sub>BERT</sub> | F<sub>BERT</sub> |
+|------------|--------|---------|--------|------------------|------------------|------------------|
+| GPT-4      | 0.084  | 0.112   | 0.170  | 0.583            | 0.693            | 0.633            |
+| GPT-3.5    | 0.105  | 0.136   | 0.177  | 0.590            | 0.701            | 0.641            |
+| ChatGLM3-6B | 0.092  | 0.115   | 0.170  | 0.587            | 0.688            | 0.633            |
+| LLMDF      | 0.074  | 0.158   | 0.076  | **0.705**        | 0.619            | 0.658            |
+| LLMDF-EXP  | **0.294** | **0.213** | **0.191** | 0.682            | **0.708**        | **0.694**        |
+|                      | Experts    | LLMDF      | LLMDF-EXP  | ChatGLM3-6B | GPT-3.5   | GPT-4     |
+|----------------------|------------|------------|------------|-------------|-----------|-----------|
+| **Empathy:** The feedback expresses appropriate empathy to the user.| 3.63±0.51  | 3.73±0.45  | 3.79±0.44  | 3.79±0.41   | 3.85±0.36 | 3.90±0.30 |
+| **Completeness:** The feedback provides sufficiently comprehensive information.| 3.81±0.50  | 3.05±0.79  | 3.99±0.43  | 4.11±0.63   | 4.22±0.45 | 4.26±0.49 |
+| **Professionalism:** The feedback is nutritionally accurate, tailored, and responds soundly to the user.| 3.93±0.77  | 2.86±0.83  | 3.83±0.57  | 3.41±0.72   | 3.76±0.58 | 3.63±0.66 |
+| **Usefulness:** I can use it as a template to write my feedback to this scene.| 3.70±0.78  | 2.73±0.77  | 3.79±0.72  | 3.03±0.57   | 3.50±0.72 | 3.30±0.68 |
+## 使用方式
+LLMDF-EXP是由[ChatGLM3-6B](https://huggingface.co/zai-org/chatglm3-6b)微调得到的，当使用Hugging Face的``transformers``进行推理时，需要注意构造与训练时对话格式一致的[提示词模版](https://github.com/zai-org/ChatGLM3/blob/main/PROMPT.md)
+```
+<|user|>
+想象你是一位营养师，你将收到进行饮食管理的客户的饮食信息。你的任务是提供对客户饮食信息的反馈，你需要提供有见解性的内容，给客户饮食指导，引导客户进行健康的生活方式。你可以参考从以下角度组织你给客户的饮食反馈：分析客户摄入食物的属性、提供健康食物的选择、帮助客户制定饮食模式、鼓励客户维持健康的饮食习惯、建议客户修正饮食习惯。<MealInformation><MealType>早餐</MealType><MealIntakeInformation>豆浆 1碗;鸡蛋 1中只;猪肉玉米胡萝卜糯米饭 200克;</MealIntakeInformation></MealInformation><PersonalHealthInformation><Gender>男</Gender><Age>19</Age><Height>180cm</Height><Weight>84.0kg</Weight><Waistline>85cm</Waistline><LaborIntensity>中等劳动强度</LaborIntensity></PersonalHealthInformation>
+<|assistant|>
+```
+当使用上述示例的提示词时，LLMDF-EXP的生成结果为：
+```
+<|user|>
+想象你是一位营养师，你将收到进行饮食管理的客户的饮食信息。你的任务是提供对客户饮食信息的反馈，你需要提供有见解性的内容，给客户饮食指导，引导客户进行健康的生活方式。你可以参考从以下角度组织你给客户的饮食反馈：分析客户摄入食物的属性、提供健康食物的选择、帮助客户制定饮食模式、鼓励客户维持健康的饮食习惯、建议客户修正饮食习惯。<MealInformation><MealType>早餐</MealType><MealIntakeInformation>豆浆 1碗;鸡蛋 1中只;猪肉玉米胡萝卜糯米饭 200克;</MealIntakeInformation></MealInformation><PersonalHealthInformation><Gender>男</Gender><Age>19</Age><Height>180cm</Height><Weight>84.0kg</Weight><Waistline>85cm</Waistline><LaborIntensity>中等劳动强度</LaborIntensity></PersonalHealthInformation>
+<|assistant|>
+<DietFeedback>早餐的搭配还算合理，但建议减少猪肉和米饭的摄入量，增加蔬菜的摄入，例如加入一份青菜或者番茄。此外，如果喜欢喝豆浆，可以选择无糖的豆浆或者低脂的豆浆来替代普通豆浆。</DietFeedback>
+```