---
license: apache-2.0
language:
- zh
- en
metrics:
- accuracy
base_model:
- Qwen/Qwen3-32B
pipeline_tag: text-generation
library_name: transformers
tags:
- medical
model-index:
- name: Med-Go-32B
  results:
  - task:
      type: text-generation
    dataset:
      type: medical_eval_hle
      name: Medical-Eval-HLE
    metrics:
    - name: accuracy
      type: accuracy
      value: 19.4
      verified: false
  - task:
      type: text-generation
    dataset:
      type: supergpqa
      name: SuperGPQA
    metrics:
    - name: accuracy
      type: accuracy
      value: 37.2
      verified: false
  - task:
      type: text-generation
    dataset:
      type: medbullets
      name: Medbullets
    metrics:
    - name: accuracy
      type: accuracy
      value: 57.8
      verified: false
  - task:
      type: text-generation
    dataset:
      type: mmlu_pro
      name: MMLU-pro
    metrics:
    - name: accuracy
      type: accuracy
      value: 64.3
      verified: false
  - task:
      type: text-generation
    dataset:
      type: afrimedqa
      name: AfrimedQA
    metrics:
    - name: accuracy
      type: accuracy
      value: 74.7
      verified: false
  - task:
      type: text-generation
    dataset:
      type: medmcqa
      name: MedMCQA
    metrics:
    - name: accuracy
      type: accuracy
      value: 68.3
      verified: false
  - task:
      type: text-generation
    dataset:
      type: medqa_usmle
      name: MedQA-USMLE
    metrics:
    - name: accuracy
      type: accuracy
      value: 76.8
      verified: false
  - task:
      type: text-generation
    dataset:
      type: cmb
      name: CMB
    metrics:
    - name: accuracy
      type: accuracy
      value: 92.5
      verified: false
  - task:
      type: text-generation
    dataset:
      type: cmexam
      name: CMExam
    metrics:
    - name: accuracy
      type: accuracy
      value: 87.4
      verified: false
  - task:
      type: text-generation
    dataset:
      type: pubmedqa
      name: PubMedQA
    metrics:
    - name: accuracy
      type: accuracy
      value: 76.6
      verified: false
  - task:
      type: text-generation
    dataset:
      type: medexqa
      name: MedExQA
    metrics:
    - name: accuracy
      type: accuracy
      value: 81.5
      verified: false
  - task:
      type: text-generation
    dataset:
      type: explaincpe
      name: ExplainCPE
    metrics:
    - name: accuracy
      type: accuracy
      value: 89.5
      verified: false
  - task:
      type: text-generation
    dataset:
      type: mmlu_med
      name: MMLU-Med
    metrics:
    - name: accuracy
      type: accuracy
      value: 87.4
      verified: false
  - task:
      type: text-generation
    dataset:
      type: medxperqa
      name: MedXperQA
    metrics:
    - name: accuracy
      type: accuracy
      value: 20.7
      verified: false
  - task:
      type: text-generation
    dataset:
      type: anesbench
      name: AnesBench
    metrics:
    - name: accuracy
      type: accuracy
      value: 53.1
      verified: false
  - task:
      type: text-generation
    dataset:
      type: diagnosisarena
      name: DiagnosisArena
    metrics:
    - name: accuracy
      type: accuracy
      value: 64.4
      verified: false
  - task:
      type: text-generation
    dataset:
      type: clinbench_hbp
      name: Clinbench-HBP
    metrics:
    - name: accuracy
      type: accuracy
      value: 80.6
      verified: false
  - task:
      type: text-generation
    dataset:
      type: medpair
      name: MedPAIR
    metrics:
    - name: accuracy
      type: accuracy
      value: 32.3
      verified: false
  - task:
      type: text-generation
    dataset:
      type: amqa
      name: AMQA
    metrics:
    - name: accuracy
      type: accuracy
      value: 72.7
      verified: false
  - task:
      type: text-generation
    dataset:
      type: medethicaleval
      name: MedethicalEval
    metrics:
    - name: accuracy
      type: accuracy
      value: 92.2
      verified: false
---

# MedGo: Medical Large Language Model Based on Qwen3-32B

<div align="center">

English | [简体中文](./README_CN.md)

</div>

## 📋 Table of Contents

- [Introduction](#introduction)
- [Key Features](#key-features)
- [Performance](#performance)
- [Quick Start](#quick-start)
- [Training Details](#training-details)
- [Use Cases](#use-cases)
- [Limitations & Risks](#limitations--risks)
- [Citation](#citation)
- [License](#license)
- [Contributing](#contributing)
- [Contact](#contact)

## 🎯 Introduction

**MedGo** is a general-purpose medical large language model fine-tuned from **Qwen3-32B** for clinical medicine and research scenarios. It is trained on large-scale, multi-source medical corpora augmented with complex case data, and supports medical Q&A, clinical summarization, clinical reasoning, multi-turn dialogue, and scientific text generation.

### 🌟 Core Capabilities

- **📚 Medical Knowledge Q&A**: Professional responses based on authoritative medical literature and clinical guidelines
- **📝 Clinical Documentation**: Automated medical record summaries, diagnostic reports, and medical documentation
- **🔍 Clinical Reasoning**: Differential diagnosis, examination recommendations, and treatment suggestions
- **💬 Multi-turn Dialogue**: Patient-doctor interaction simulation and complex case discussions
- **🔬 Research Support**: Literature summarization, research idea generation, and quality control review

## ✨ Key Features

| Feature | Details |
|---------|---------|
| **Base Architecture** | Qwen3-32B |
| **Parameters** | 32B |
| **Domain** | Clinical Medicine, Research Support, Healthcare System Integration |
| **Fine-tuning Method** | SFT + Preference Alignment (DPO/KTO) |
| **Data Sources** | Authoritative medical literature, clinical guidelines, real cases (anonymized) |
| **Deployment** | Local deployment, HIS/EMR system integration |
| **License** | Apache 2.0 |

## 📊 Performance

MedGo is evaluated on multiple medical and general benchmarks and shows competitive results among 32B-parameter models. The accuracy values reported in this card's metadata range from 19.4 (Medical-Eval-HLE) to 92.5 (CMB).

### Key Benchmark Results

- **AIMedQA**: Medical question answering comprehension
- **CME**: Clinical reasoning evaluation
- **DiagnosisArena**: Diagnostic capability assessment
- **MedQA / MedMCQA**: Medical multiple-choice questions
- **PubMedQA**: Biomedical literature Q&A
- **MMLU-Pro**: Comprehensive capability evaluation

**Figure**: Performance Comparison

**Performance Highlights**:
- ✅ **Average Score**: ~70 points across the evaluated benchmarks, competitive within the 32B-parameter class
- ✅ **Strong Tasks**: Clinical reasoning (DiagnosisArena, CME) and multi-turn medical Q&A
- ✅ **Balanced Capability**: Good performance in medical semantic understanding and multi-task generalization

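As a rough cross-check of the headline figure, the sketch below averages the accuracy values listed in this card's `model-index` metadata. Taking the unweighted mean over all 20 listed benchmarks is an assumption about how one might aggregate them; the ~70 figure above may have been computed over a different subset or weighting. The `summarize` helper is a hypothetical name used only for illustration.

```python
from statistics import mean

# Accuracy values copied from this card's model-index metadata.
REPORTED_ACCURACY = {
    "Medical-Eval-HLE": 19.4,
    "SuperGPQA": 37.2,
    "Medbullets": 57.8,
    "MMLU-pro": 64.3,
    "AfrimedQA": 74.7,
    "MedMCQA": 68.3,
    "MedQA-USMLE": 76.8,
    "CMB": 92.5,
    "CMExam": 87.4,
    "PubMedQA": 76.6,
    "MedExQA": 81.5,
    "ExplainCPE": 89.5,
    "MMLU-Med": 87.4,
    "MedXperQA": 20.7,
    "AnesBench": 53.1,
    "DiagnosisArena": 64.4,
    "Clinbench-HBP": 80.6,
    "MedPAIR": 32.3,
    "AMQA": 72.7,
    "MedethicalEval": 92.2,
}


def summarize(scores):
    """Return (unweighted mean rounded to 2 places, best benchmark, worst benchmark)."""
    best = max(scores, key=scores.get)
    worst = min(scores, key=scores.get)
    return round(mean(scores.values()), 2), best, worst


if __name__ == "__main__":
    average, best, worst = summarize(REPORTED_ACCURACY)
    print(f"mean={average} best={best} worst={worst}")
```

An unweighted mean treats a hard reasoning benchmark and a near-saturated exam benchmark as equally important, so it is best read alongside the per-benchmark values rather than on its own.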
## 🚀 Quick Start

### Requirements

- Python >= 3.9
- PyTorch >= 2.0
- Transformers >= 4.51.0 (earlier releases do not include the Qwen3 architecture)
- CUDA >= 11.8 (for GPU inference)

### Installation

```bash
# Clone the repository
git clone https://github.com/OpenMedZoo/MedGo.git
cd MedGo

# Install dependencies
pip install -r requirements.txt
```

### Model Download

Download the model weights from Hugging Face:

```bash
# Using huggingface-cli
huggingface-cli download OpenMedZoo/MedGo --local-dir ./models/MedGo

# Or using git-lfs
git lfs install
git clone https://huggingface.co/OpenMedZoo/MedGo
```

### Basic Inference

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load model and tokenizer
model_path = "OpenMedZoo/MedGo"
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    trust_remote_code=True,
    torch_dtype="auto",
)

# Medical Q&A example
messages = [
    {"role": "system", "content": "You are a professional medical assistant. Please answer questions based on medical knowledge."},
    {"role": "user", "content": "What is hypertension and what are the common treatment methods?"},
]

# Build the prompt with the chat template.
# return_dict=True returns both input_ids and the attention mask.
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

# Generate response
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
)

# Decode only the newly generated tokens, skipping the prompt
prompt_length = inputs["input_ids"].shape[-1]
response = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True)
print(response)
```

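For multi-turn dialogue, one common pattern is to keep the full message history and append each model reply before the next user turn. The sketch below shows only that bookkeeping. `ChatSession`, `echo_backend`, and the `generate_reply` callable are hypothetical names, not part of this repository; in practice `generate_reply` would wrap the `apply_chat_template` and `model.generate` steps shown above.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


class ChatSession:
    """Keeps the running message history for a multi-turn conversation (hypothetical helper)."""

    def __init__(self, generate_reply: Callable[[List[Message]], str], system_prompt: str = ""):
        self._generate_reply = generate_reply
        self.messages: List[Message] = []
        if system_prompt:
            self.messages.append({"role": "system", "content": system_prompt})

    def ask(self, user_text: str) -> str:
        # Append the user turn, generate on the full history, then store the reply
        self.messages.append({"role": "user", "content": user_text})
        reply = self._generate_reply(self.messages)
        self.messages.append({"role": "assistant", "content": reply})
        return reply


def echo_backend(messages: List[Message]) -> str:
    """Stand-in backend for illustration; replace with a call into the model."""
    user_turns = sum(message["role"] == "user" for message in messages)
    return f"(reply to turn {user_turns})"


if __name__ == "__main__":
    session = ChatSession(echo_backend, system_prompt="You are a professional medical assistant.")
    session.ask("The patient reports chest pain on exertion.")
    session.ask("Which examinations would you suggest first?")
    for message in session.messages:
        print(f"{message['role']}: {message['content']}")
```

Passing the backend in as a callable keeps the history logic independent of how the reply is generated, so the same session object could sit in front of either the Transformers or the vLLM path.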
### Batch Inference

```bash
# Use the provided inference script
python scripts/inference.py \
    --model_path OpenMedZoo/MedGo \
    --input_file examples/medical_qa.jsonl \
    --output_file results/predictions.jsonl \
    --batch_size 4
```

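The input schema expected by `scripts/inference.py` is not described here, so the sketch below is only an assumption: it writes one JSON object per line with a hypothetical `question` field and reads the file back. The output path `examples/my_questions.jsonl` is also made up for illustration. Check the example file in the repository for the actual field names before relying on this.

```python
import json
from pathlib import Path
from typing import Dict, Iterable, List


def write_jsonl(path: Path, records: Iterable[Dict[str, str]]) -> None:
    """Write one JSON object per line; ensure_ascii=False keeps Chinese text readable."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(path: Path) -> List[Dict[str, str]]:
    """Read a JSONL file, skipping blank lines."""
    with path.open("r", encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


if __name__ == "__main__":
    # "question" is a hypothetical field name, not confirmed by the repository
    questions = [
        {"question": "What are the symptoms and treatment methods for diabetes?"},
        {"question": "高血压患者在饮食上应注意什么?"},
    ]
    target = Path("examples/my_questions.jsonl")
    write_jsonl(target, questions)
    print(f"wrote {len(read_jsonl(target))} records to {target}")
```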
### Accelerated Inference with vLLM

```python
from vllm import LLM, SamplingParams

# Initialize vLLM
llm = LLM(model="OpenMedZoo/MedGo", trust_remote_code=True)
sampling_params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=512)

# Batch inference
questions = [
    "What are the symptoms and treatment methods for diabetes?",
    "What dietary precautions should hypertensive patients take?",
]

# Wrap each question in the chat template so the prompt format
# matches the chat format used in Basic Inference
tokenizer = llm.get_tokenizer()
prompts = [
    tokenizer.apply_chat_template(
        [{"role": "user", "content": question}],
        tokenize=False,
        add_generation_prompt=True,
    )
    for question in questions
]

outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    print(output.outputs[0].text)
```

## 🔧 Training Details

MedGo employs a **two-stage fine-tuning strategy** to balance general medical knowledge with clinical task adaptation.

### Stage I: General Medical Alignment

**Objective**: Establish a solid foundation of medical knowledge and improve Q&A standardization

- **Data Sources**:
  - Authoritative medical literature (PubMed, medical textbooks)
  - Clinical guidelines and diagnostic standards
  - Medical encyclopedia entries and terminology databases

- **Training Methods**:
  - Supervised Fine-Tuning (SFT)
  - Chain-of-Thought (CoT) guided samples
  - Medical terminology alignment and safety constraints

### Stage II: Clinical Task Enhancement

**Objective**: Enhance complex case reasoning and multi-task processing capabilities

- **Data Sources**:
  - Real medical records (fully anonymized)
  - Outpatient and emergency records with complex multi-diagnosis samples
  - Research articles and quality control cases

- **Data Augmentation Techniques**:
  - Semantic paraphrasing and multi-perspective expansion
  - Complex case synthesis
  - Doctor-patient interaction simulation

- **Training Methods**:
  - Multi-Task Learning (medical record summary, differential diagnosis, examination suggestions, etc.)
  - Preference Alignment (DPO/KTO)
  - Expert feedback iterative optimization

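For readers unfamiliar with preference alignment, the sketch below computes the standard DPO loss for a single preference pair from sequence log-probabilities. It illustrates the published objective in plain Python and is not the training code used for MedGo; the function name `dpo_loss` and the `beta` default of 0.1 are assumptions chosen for illustration.

```python
import math


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,
) -> float:
    """DPO loss for one (chosen, rejected) response pair.

    loss = -log(sigmoid(beta * ((pi_c - ref_c) - (pi_r - ref_r))))
    where each argument is the summed log-probability of a full response
    under the policy being trained (pi) or the frozen reference model (ref).
    """
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) equals softplus(-x); this form avoids overflow for large |x|
    return max(-logits, 0.0) + math.log1p(math.exp(-abs(logits)))


if __name__ == "__main__":
    # Policy identical to the reference: no preference has been learned yet
    print(dpo_loss(-10.0, -12.0, -10.0, -12.0))
    # Policy has raised the chosen response and lowered the rejected one
    print(dpo_loss(-8.0, -14.0, -10.0, -12.0))
```

When the policy and the reference assign identical log-probabilities, the loss equals log 2 (about 0.693). It falls as the policy raises the chosen response relative to the rejected one, and `beta` controls how strongly the policy is allowed to drift from the reference.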
### Training Optimization Focus

- ✅ Strengthen information extraction and cross-evidence reasoning for complex cases
- ✅ Improve medical consistency and interpretability of outputs
- ✅ Optimize expression compliance and safety
- ✅ Continuous iteration through expert samples and automated evaluation

## 💡 Use Cases

### ✅ Suitable Scenarios

| Scenario | Description |
|----------|-------------|
| **Clinical Assistance** | Preliminary diagnosis suggestions, medical record writing, formatted report generation |
| **Research Support** | Literature summarization, research idea generation, data analysis assistance |
| **Quality Control** | Medical document compliance checking, clinical process quality control |
| **System Integration** | Embedded in HIS/EMR systems to provide intelligent decision support |
| **Medical Education** | Case discussions, medical knowledge Q&A, clinical reasoning training |

### 🚫 Unsuitable Scenarios

- ❌ **Cannot Replace Doctors**: Only an auxiliary tool, not a standalone diagnostic basis
- ❌ **High-Risk Operations**: Not recommended for surgical decisions or other high-risk medical operations
- ❌ **Rare Disease Limitations**: May perform poorly on rare diseases outside training data
- ❌ **Emergency Care**: Not suitable for scenarios requiring immediate decisions

## ⚠️ Limitations & Risks

### Model Limitations

1. **Understanding Bias**: Despite its broad coverage of medical knowledge, the model may still misinterpret inputs or produce incorrect recommendations
2. **Complex Cases**: Risk is higher for cases with complex conditions, severe complications, or missing information
3. **Knowledge Currency**: Medical knowledge is updated continuously, so the training data may lag behind current practice
4. **Language Limitation**: Primarily designed for Chinese medical scenarios; performance in other languages may vary

### Usage Recommendations

- ⚠️ Use in controlled environments with clinical expert review of generated results
- ⚠️ Treat model outputs as auxiliary references, not final diagnostic conclusions
- ⚠️ For sensitive cases or high-risk scenarios, expert consultation is mandatory
- ⚠️ Deployment requires internal validation, security review, and clinical testing

### Data Privacy & Compliance

- 🔒 Training data is fully anonymized
- 🔒 Protect patient privacy when using the model
- 🔒 Production deployment must comply with healthcare data security regulations (e.g., HIPAA, GDPR)
- 🔒 Local deployment is recommended to avoid transmitting sensitive data

## 📚 Citation

If MedGo is helpful for your research or project, please cite our work:

```bibtex
@misc{openmedzoo_2025,
  author    = { OpenMedZoo },
  title     = { MedGo (Revision 640a2e2) },
  year      = 2025,
  url       = { https://huggingface.co/OpenMedZoo/MedGo },
  doi       = { 10.57967/hf/7024 },
  publisher = { Hugging Face }
}
```

## 📄 License

This project is licensed under the [Apache License 2.0](LICENSE).

**Commercial Use Notice**:
- ✅ Commercial use and modification allowed
- ✅ Original license and copyright notice must be retained
- ✅ Contact us for technical support when integrating into healthcare systems

## 🤝 Contributing

We welcome community contributions! Here's how to participate:

### Contribution Types

- 🐛 Submit bug reports
- 💡 Propose new features
- 📝 Improve documentation
- 🔧 Submit code fixes or optimizations
- 📊 Share evaluation results and use cases

## 🙏 Acknowledgments

Thanks to all contributors to the MedGo project:

- Model development and fine-tuning algorithm team
- Data annotation and quality control team
- Clinical expert guidance and review team
- Open-source community support and feedback

Special thanks to:
- [Qwen Team](https://github.com/QwenLM/Qwen) for providing excellent foundation models
- All healthcare institutions that provided data and feedback

## 📧 Contact

- **HuggingFace**: [Model Homepage](https://huggingface.co/OpenMedZoo/MedGo)

## Copyright

- Publisher: Tongji University Affiliated East Hospital (sole corresponding author)
- Co-developer / Technical Support: Shanghai Shuole Technology Co., Ltd.
- Contact: dongfyy@pudong.gov.cn
- Version: v1.0
- Attribution (required):
  “Powered by Med-Go 32B, released by Tongji University Affiliated East Hospital (v1.0).”
