License: apache-2.0

Taiyi 2 (太一2): A Bilingual (Chinese and English) Fine-Tuned Large Language Model for Diverse Biomedical Tasks

Demo | Github | Paper | Data

This is the Taiyi 2 model, built on GLM4-9B as the base model and developed by the DUTIR lab.

Project Background

With the rapid development of deep learning technology, large language models (LLMs) like ChatGPT and DeepSeek have made significant progress in the field of natural language processing. In the biomedical domain, large language models can facilitate communication between doctors and patients, provide useful medical information, and hold great potential in areas such as clinical decision support, biomedical knowledge discovery, drug development, and personalized treatment planning. This project therefore focuses on developing a multilingual, multi-task large language model tailored for various biomedical scenarios, aiming to achieve high performance with low resource consumption. In October 2023, we released the initial version of a bilingual Chinese-English biomedical large language model, Taiyi. Building on that work, we have now completed the development of Taiyi 2 and open-sourced the model.

Major Updates in Taiyi 2

Compared to Taiyi 1, Taiyi 2 introduces further research and improvements in areas such as the model backbone, data instructions, and task-specific instructions. The main updates are as follows:

  • Updated Backbone: Taiyi 2 replaces the original Qwen-7B backbone with GLM4-9B.
  • High-Quality Data Filtering: Based on dataset annotation guidelines, data quality has been further refined by removing low-quality samples. Additionally, the data distribution across different tasks has been rebalanced to address extreme imbalances.
  • Refined Task Instructions: Tasks are categorized by type, and experimental testing was conducted to evaluate various instruction construction methods. This led to the development of a refined, task-optimized instruction design strategy.

Environment Configuration

The environment configuration we used for training and testing is as follows:

torch==2.4.0
ms_swift==2.6.1
transformers==4.44.0
transformers-stream-generator==0.0.5
vllm==0.6.0
vllm-flash-attn==2.6.1

To install all dependencies automatically, run:

$ pip install -r requirements.txt

Model Inference

Refer to the taiyi2_chat.py file for inference code. Using a GPU is recommended for faster inference.
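For reference, a minimal inference sketch with the transformers library is shown below. The Hub model id `DUTIR-BioNLP/Taiyi2` and the example question are assumptions for illustration; the authoritative loading code and chat template are in the repository's taiyi2_chat.py.

```python
MODEL_ID = "DUTIR-BioNLP/Taiyi2"  # hypothetical Hub id; see the repo for the real one

def build_messages(question):
    """Wrap a user question in the chat-message format expected by
    tokenizer.apply_chat_template."""
    return [{"role": "user", "content": question}]

def generate_answer(question, max_new_tokens=256):
    # Imported lazily so build_messages stays usable without GPU dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # half precision to fit a 9B model on one GPU
        device_map="auto",           # place layers on GPU(s) if available
        trust_remote_code=True,
    )
    input_ids = tokenizer.apply_chat_template(
        build_messages(question),
        add_generation_prompt=True,
        return_tensors="pt",
    ).to(model.device)
    with torch.no_grad():
        output = model.generate(input_ids, max_new_tokens=max_new_tokens)
    # Keep only the newly generated tokens, dropping the prompt.
    return tokenizer.decode(output[0][input_ids.shape[1]:], skip_special_tokens=True)

if __name__ == "__main__":
    print(generate_answer("What are the common side effects of metformin?"))
```

The `device_map="auto"` and `bfloat16` settings are one reasonable configuration for a 9B-parameter model on a single modern GPU; adjust them to match your hardware.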

Citation

If you use this project's repository, please cite the following paper:

@article{Taiyi,
  title={Taiyi: A Bilingual Fine-Tuned Large Language Model for Diverse Biomedical Tasks},
  author={Ling Luo and Jinzhong Ning and Yingwen Zhao and Zhijun Wang and Zeyuan Ding and Peng Chen and Weiru Fu and Qinyu Han and Guangtao Xu and Yunzhi Qiu and Dinghao Pan and Jiru Li and Hao Li and Wenduo Feng and Senbo Tu and Yuqi Liu and Zhihao Yang and Jian Wang and Yuanyuan Sun and Hongfei Lin},
  journal={Journal of the American Medical Informatics Association},
  year={2024},
  doi={10.1093/jamia/ocae037},
  url={https://doi.org/10.1093/jamia/ocae037}
}