---
language:
- zh
- en
- vi
base_model:
- Qwen/Qwen2.5-3B-Instruct
pipeline_tag: text-generation
tags:
- biology
- medical
license: apache-2.0
---
<p align="center">
<br>
<img src="./image/mkty_cn_light_huggingface.svg" style="width:63%;">
</p>
<br>
# Minh Khoe Tue Y LLM (MKTY-3B-Chat)
[DOI: 10.5281/zenodo.17444889](https://doi.org/10.5281/zenodo.17444889)
### 🌍 Documentation Language
[**Chinese Simplified (简体中文)**](./README.zh-CN.md) | [**English**](./README.md) | [**Vietnamese (Tiếng Việt)**](./README.vi.md)
> Please note that the English and Vietnamese versions of this document were translated from the Chinese version by an LLM and then proofread manually. Discrepancies may still exist. Where the English or Vietnamese version is inconsistent with the Chinese version, the Chinese version shall prevail.
**Full Project Title:** Minh Khoe Tue Y (_Chinese Simplified: 明康慧医_; _Vietnamese: Minh Khỏe Tuệ Y_; _Nom Script: 明劸慧醫_ ) — Design and Implementation of a Health Management and Diagnostic Assistance System Based on LLMs and Multimodal Artificial Intelligence. **Abbreviation:** MKTY Smart Healthcare System
### 📖 Model Overview
This model is a component of the "Minh Khoe Tue Y - Design and Implementation of a Health Management and Diagnostic Assistance System Based on LLMs and Multimodal Artificial Intelligence" project (referred to as the MKTY Smart Healthcare System). It was developed as part of my undergraduate graduation project at the Faculty of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), class of 2025. The project is open source and available at: [https://github.com/duyu09/MKTY-System](https://github.com/duyu09/MKTY-System).
This model has been fine-tuned for medicine, healthcare, and biology, where it outperforms its base model, `Qwen2.5-3B-Instruct`. Fine-tuning uses the LoRA algorithm and proceeds in two stages, both on Chinese-language data only. First, in the Pretrain stage, the model is trained incrementally on medical textbooks, medical records, and healthcare-related articles. Then Supervised Fine-Tuning (SFT) is performed on corpora that include symptoms paired with medical records, doctor-patient dialogues (symptom descriptions and diagnoses), medical knowledge Q&A, and dialogues based on the "LLM Discussion Mechanism." The total data volume is approximately `2.88GB`.
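To give a feel for why LoRA is a practical choice at this scale, the sketch below compares trainable-parameter counts for a single weight matrix. LoRA freezes the original matrix and trains a low-rank update `B @ A` instead, where `B` is `d_out x rank` and `A` is `rank x d_in`. The rank and layer shapes used for MKTY-3B-Chat are not stated here, so the numbers in the example are purely illustrative assumptions.

```python
from typing import Tuple


def lora_param_counts(d_out: int, d_in: int, rank: int) -> Tuple[int, int]:
    """Return (full, lora) trainable-parameter counts for one weight matrix."""
    full = d_out * d_in            # training the d_out x d_in matrix directly
    lora = rank * (d_out + d_in)   # B is d_out x rank, A is rank x d_in
    return full, lora


# Illustrative values only; not the project's actual configuration.
full, lora = lora_param_counts(d_out=2048, d_in=2048, rank=8)
print(f"full: {full}, LoRA: {lora}, ratio: {lora / full:.2%}")
```

With these assumed shapes the low-rank update trains well under 1% of the parameters that a full update of the same matrix would.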
Notably, the model has been optimized for the "LLM Discussion Mechanism." The specific operation of this mechanism is as follows: when answering each question, the model generates multiple results based on different contexts, simulating a scenario where "multiple individuals express their viewpoints." The system also includes a "moderator" role responsible for summarizing the viewpoints from each round of discussion. Subsequently, all participants engage in the next round of discussion based on the original question, the moderator's summary, and their respective contexts. This process iterates until the discussion results converge (i.e., the semantics become consistent) or the preset maximum number of discussion rounds is reached.
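The control flow of the mechanism described above can be sketched as follows. This is a minimal illustration, not the project's implementation: `run_discussion`, `similarity`, and the `threshold` parameter are hypothetical names, the agents and moderator are passed in as plain callables, and convergence is approximated by comparing consecutive moderator summaries with a character-level ratio from `difflib`, which is only a crude stand-in for a real semantic-consistency check.

```python
from difflib import SequenceMatcher
from typing import Callable, List


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]; a crude proxy for semantic agreement."""
    return SequenceMatcher(None, a, b).ratio()


def run_discussion(
    question: str,
    agents: List[Callable[[str, str], str]],
    moderator: Callable[[str, List[str]], str],
    max_rounds: int = 3,
    threshold: float = 0.9,
) -> str:
    """Iterate until consecutive moderator summaries agree or max_rounds is reached."""
    summary = ""  # no moderator opinion exists before the first round
    for _ in range(max_rounds):
        # Each participant answers from the question plus the previous summary.
        opinions = [agent(question, summary) for agent in agents]
        new_summary = moderator(question, opinions)
        # Assumption: near-identical consecutive summaries count as convergence.
        if summary and similarity(summary, new_summary) >= threshold:
            return new_summary
        summary = new_summary
    return summary


# Stub participants so the sketch runs without loading a model.
stub_agents = [lambda q, s: "rest and fluids", lambda q, s: "rest and fluids"]
stub_moderator = lambda q, opinions: "; ".join(sorted(set(opinions)))
print(run_discussion("How should a mild cold be managed?", stub_agents, stub_moderator))
```

In a real setup each callable would wrap a model call with its own context, and the similarity function would be replaced by something that compares meaning rather than characters.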
### 🔧 Hardware Requirements
GPU inference requires at least `7GB` of VRAM. If VRAM is insufficient or no dedicated GPU is available, MKTY-3B-Chat can also run on the `CPU` with `7GB` of RAM.
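As a rough cross-check of the figure above, the weights alone can be estimated from the parameter count and the numeric precision. The sketch below assumes "3B" means about 3 billion parameters and ignores activations, the KV cache, and framework overhead, so it is a back-of-envelope lower bound rather than a measurement.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Memory needed for the model weights alone, in GiB."""
    return num_params * bytes_per_param / 1024**3


APPROX_PARAMS = 3.0e9  # assumption: "3B" taken as roughly 3 billion parameters

for name, width in [("float32", 4), ("bfloat16/float16", 2)]:
    print(f"{name}: {weight_memory_gib(APPROX_PARAMS, width):.1f} GiB")
```

Under these assumptions the 16-bit weights come to roughly 5.6 GiB, which fits within the stated `7GB` once some headroom for generation is added. Loading in 32-bit precision would need about twice that.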
### 🚀 Usage Example
Because MKTY-3B-Chat is built on the Tongyi Qianwen (Chinese: 通义千问) `Qwen2.5-3B-Instruct` model, it can be loaded and run directly with the `transformers` library.
**Model Loading**
```python
from transformers import AutoModelForCausalLM, AutoTokenizer


def load_model_and_tokenizer(model_name):
    # Use the checkpoint's own dtype and place layers on the available devices
    # (device_map="auto" requires the accelerate package).
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        torch_dtype="auto",
        device_map="auto"
    )
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    return model, tokenizer


def generate_response(prompt, messages, model, tokenizer, max_new_tokens=2000):
    # Append the user turn to the running history (the list is modified in place).
    messages.append({"role": "user", "content": prompt})
    # Render the whole conversation with the model's chat template.
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )
    model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
    generated_ids = model.generate(
        **model_inputs,
        max_new_tokens=max_new_tokens
    )
    # Drop the prompt tokens so only the newly generated tokens are decoded.
    generated_ids = [
        output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
    ]
    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
    # Record the reply so later turns can see it.
    messages.append({"role": "assistant", "content": response})
    return response
```
**Standard Q&A Mode**
```python
if __name__ == "__main__":
    model_name = r"MKTY-3B-Chat"
    messages = []
    model, tokenizer = load_model_and_tokenizer(model_name)
    while True:
        prompt = input("User> ")
        if prompt == "exit":
            break
        response = generate_response(prompt, messages, model, tokenizer)
        print("MKTY>", response)
```
**LLM Discussion Mode** (Example language: Chinese Simplified)
```python
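if __name__ == "__main__":
    model_name = "MKTY-3B-Chat"
    discuss_rounds = 3  # number of discussion rounds per question
    agent_number = 3    # number of simulated participants
    model, tokenizer = load_model_and_tokenizer(model_name)
    messages_arr = [[] for _ in range(agent_number)]  # one chat history per participant
    while True:
        prompt = input("User> ")
        if prompt == "exit":
            break
        moderator_opinion = "暂无"  # "none yet"
        for i in range(discuss_rounds):
            responses_arr = []
            # Participant prompt: the question, the moderator's opinion from the previous round,
            # and a request for a detailed view that may challenge that opinion with reasons.
            prompt_per_round = "- 问题:\n" + prompt + "\n - 上轮讨论主持人意见:\n" + moderator_opinion + "\n - 请你结合主持人意见,对上述医疗或医学专业的问题发表详细观点,可以质疑并说明理由。\n"
            for j in range(agent_number):
                messages = messages_arr[j]
                response = generate_response(prompt_per_round, messages, model, tokenizer)
                responses_arr.append(response)
                print(f"第{i + 1}轮讨论,LLM {j + 1}观点>\n", response)  # "Round i, view of LLM j"
                print("-------------------")
            # Moderator prompt: the question, every participant's view, and a request to
            # weigh them and give a thorough, well-reasoned judgment.
            moderator_prompt = "- 问题:\n" + prompt + "\n\n"
            for res_index in range(len(responses_arr)):
                moderator_prompt = moderator_prompt + f"- LLM {res_index + 1}观点:\n" + responses_arr[res_index] + "\n\n"
            moderator_prompt = moderator_prompt + "对于给定的医疗相关问题,请综合各LLM观点,结合自身知识,得出你自己的判断,尽可能详尽,全部都分析到位,还要充分说明理由。\n"
            # The moderator starts from an empty history each round.
            moderator_opinion = generate_response(moderator_prompt, [], model, tokenizer)
            print(f"第{i + 1}轮讨论,主持人的意见>\n", moderator_opinion)  # "Round i, moderator's opinion"
            print("-------------------")
        # Reset every participant's history before the next question.
        for messages in messages_arr:
            messages.clear()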
```
## 🎓 Authors
```
██\ ██\ ██\ ██\ ████████\ ██\ ██\
███\ ███ | ██ | ██ | \__██ __| \██\ ██ |
████\ ████ | ██ |██ / ██ | \██\ ██ /
██\██\██ ██ | █████ / ██ | \████ /
██ \███ ██ | ██ ██< ██ | \██ /
██ |\█ /██ | ██ |\██\ ██ | ██ |
██ | \_/ ██ |██\ ██ | \██\ ██\ ██ |██\ ██ |██\
\__| \__|\__|\__| \__|\__| \__|\__| \__|\__|
```
This model was developed for a 2025 graduation project at the Faculty of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), and is intended for academic exchange only. Neither I nor my advisors are responsible for any consequences arising from the use of the model.
- **🧑💻 Project Author:**
  - **DU Yu** (Chinese: _杜宇_; Vietnamese: _Đỗ Vũ_; Email: <202103180009@stu.qlu.edu.cn>), undergraduate student at the Faculty of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences)
- **🏫 Thesis Advisors:**
  - Academic Advisor: **JIANG Wenfeng** (Chinese: _姜文峰_; Vietnamese: _Khương Văn Phong_), Associate Professor, Faculty of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences)
  - Industry Advisor: **LI Jun** (Chinese: _李君_; Vietnamese: _Lý Quân_), Shandong Strong (Shichuang) Software Training College, Ambow Education Group ([NYSE: AMBO](https://www.nyse.com/quote/XASE:AMBO))
The complete project is open source at [https://github.com/duyu09/MKTY-System](https://github.com/duyu09/MKTY-System). You are welcome to download it and join the discussion.
## 🔗 Links
- Qilu University of Technology (Shandong Academy of Sciences): [https://www.qlu.edu.cn/](https://www.qlu.edu.cn/)
- Shandong Computer Center (National Supercomputing Center in Jinan, _NSCCJN_): [https://www.nsccjn.cn/](https://www.nsccjn.cn/)
- Faculty of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences): [http://jsxb.scsc.cn/](http://jsxb.scsc.cn/)
- DuYu's GitHub Account: [https://github.com/duyu09/](https://github.com/duyu09/)
## 📄 Citation
```
@software{du_2025_17444889,
  author    = {Du, Yu},
  title     = {Minh Khoe Tue Y Smart Healthcare System},
  month     = oct,
  year      = 2025,
  publisher = {Zenodo},
  version   = {v1.1.2},
  doi       = {10.5281/zenodo.17444889},
  url       = {https://github.com/duyu09/MKTY-System},
  swhid     = {swh:1:dir:a633243bf04e6ba18e2d5ffcf92ea57f73566f43;origin=https://doi.org/10.5281/zenodo.17444888;visit=swh:1:snp:37dc91d2c166a07c7dc8ebac0b4be97961b0267b;anchor=swh:1:rel:a88f82a5ca10d278bcc10734f5cfa560286a8b47;path=duyu09-MKTY-System-8edd0c9},
}
```