--- license: apache-2.0 language: - zh - en metrics: - accuracy base_model: - Qwen/Qwen3-32B pipeline_tag: text-generation library_name: transformers tags: - medical model-index: - name: Med-Go-32B results: - task: type: text-generation dataset: type: medical_eval_hle name: Medical-Eval-HLE metrics: - name: accuracy type: accuracy value: 19.4 verified: false - task: type: text-generation dataset: type: supergpqa name: SuperGPQA metrics: - name: accuracy type: accuracy value: 37.2 verified: false - task: type: text-generation dataset: type: medbullets name: Medbullets metrics: - name: accuracy type: accuracy value: 57.8 verified: false - task: type: text-generation dataset: type: mmlu_pro name: MMLU-pro metrics: - name: accuracy type: accuracy value: 64.3 verified: false - task: type: text-generation dataset: type: afrimedqa name: AfrimedQA metrics: - name: accuracy type: accuracy value: 74.7 verified: false - task: type: text-generation dataset: type: medmcqa name: MedMCQA metrics: - name: accuracy type: accuracy value: 68.3 verified: false - task: type: text-generation dataset: type: medqa_usmle name: MedQA-USMLE metrics: - name: accuracy type: accuracy value: 76.8 verified: false - task: type: text-generation dataset: type: cmb name: CMB metrics: - name: accuracy type: accuracy value: 92.5 verified: false - task: type: text-generation dataset: type: cmexam name: CMExam metrics: - name: accuracy type: accuracy value: 87.4 verified: false - task: type: text-generation dataset: type: pubmedqa name: PubMedQA metrics: - name: accuracy type: accuracy value: 76.6 verified: false - task: type: text-generation dataset: type: medexqa name: MedExQA metrics: - name: accuracy type: accuracy value: 81.5 verified: false - task: type: text-generation dataset: type: explaincpe name: ExplainCPE metrics: - name: accuracy type: accuracy value: 89.5 verified: false - task: type: text-generation dataset: type: mmlu_med name: MMLU-Med metrics: - name: accuracy type: accuracy value: 87.4 verified: false - task: type: text-generation dataset: type: medxperqa name: MedXperQA metrics: - name: accuracy type: accuracy value: 20.7 verified: false - task: type: text-generation dataset: type: anesbench name: AnesBench metrics: - name: accuracy type: accuracy value: 53.1 verified: false - task: type: text-generation dataset: type: diagnosisarena name: DiagnosisArena metrics: - name: accuracy type: accuracy value: 64.4 verified: false - task: type: text-generation dataset: type: clinbench_hbp name: Clinbench-HBP metrics: - name: accuracy type: accuracy value: 80.6 verified: false - task: type: text-generation dataset: type: medpair name: MedPAIR metrics: - name: accuracy type: accuracy value: 32.3 verified: false - task: type: text-generation dataset: type: amqa name: AMQA metrics: - name: accuracy type: accuracy value: 72.7 verified: false - task: type: text-generation dataset: type: medethicaleval name: MedethicalEval metrics: - name: accuracy type: accuracy value: 92.2 verified: false --- # MedGo: Medical Large Language Model Based on Qwen3-32B
[![HuggingFace](https://img.shields.io/badge/%F0%9F%A4%97%20HuggingFace-Model-yellow)](https://huggingface.co/OpenMedZoo/MedGo) [![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](LICENSE) [![Python](https://img.shields.io/badge/Python-3.8+-blue.svg)](https://www.python.org/) English | [็ฎ€ไฝ“ไธญๆ–‡](./README_CN.md)
## ๐Ÿ“‹ Table of Contents - [Introduction](#introduction) - [Key Features](#key-features) - [Performance](#performance) - [Quick Start](#quick-start) - [Training Details](#training-details) - [Use Cases](#use-cases) - [Limitations & Risks](#limitations--risks) - [Citation](#citation) - [License](#license) - [Contributing](#contributing) - [Contact](#contact) ## ๐ŸŽฏ Introduction **MedGo** is a general-purpose medical large language model fine-tuned from **Qwen3-32B**, designed for clinical medicine and research scenarios. The model is trained on large-scale multi-source medical corpora and enhanced with complex case data, supporting various capabilities including medical Q&A, clinical summary, clinical reasoning, multi-turn dialogue, and scientific text generation. ### ๐ŸŒŸ Core Capabilities - **๐Ÿ“š Medical Knowledge Q&A**: Professional responses based on authoritative medical literature and clinical guidelines - **๐Ÿ“ Clinical Documentation**: Automated medical record summaries, diagnostic reports, and medical documentation - **๐Ÿ” Clinical Reasoning**: Differential diagnosis, examination recommendations, and treatment suggestions - **๐Ÿ’ฌ Multi-turn Dialogue**: Patient-doctor interaction simulation and complex case discussions - **๐Ÿ”ฌ Research Support**: Literature summarization, research idea generation, and quality control review ## โœจ Key Features | Feature | Details | |---------|---------| | **Base Architecture** | Qwen3-32B | | **Parameters** | 32B | | **Domain** | Clinical Medicine, Research Support, Healthcare System Integration | | **Fine-tuning Method** | SFT + Preference Alignment (DPO/KTO) | | **Data Sources** | Authoritative medical literature, clinical guidelines, real cases (anonymized) | | **Deployment** | Local deployment, HIS/EMR system integration | | **License** | Apache 2.0 | ## ๐Ÿ“Š Performance MedGo demonstrates excellent performance across multiple medical and general evaluation benchmarks, showing competitive results among 32B-parameter models: ### Key Benchmark Results - **AIMedQA**: Medical question answering comprehension - **CME**: Clinical reasoning evaluation - **DiagnosisArena**: Diagnostic capability assessment - **MedQA / MedMCQA**: Medical multiple-choice questions - **PubMedQA**: Biomedical literature Q&A - **MMLU-Pro**: Comprehensive capability evaluation ![Performance Comparison](./main_results.png) **Performance Highlights**: - โœ… **Average Score**: ~70 points (excellent performance in the 32B parameter class) - โœ… **Strong Tasks**: Clinical reasoning (DiagnosisArena, CME) and multi-turn medical Q&A - โœ… **Balanced Capability**: Good performance in medical semantic understanding and multi-task generalization ## ๐Ÿš€ Quick Start ### Requirements - Python >= 3.8 - PyTorch >= 2.0 - Transformers >= 4.35.0 - CUDA >= 11.8 (for GPU inference) ### Installation ```bash # Clone the repository git clone https://github.com/OpenMedZoo/MedGo.git cd MedGo # Install dependencies pip install -r requirements.txt ``` ### Model Download Download model weights from HuggingFace: ```bash # Using huggingface-cli huggingface-cli download OpenMedZoo/MedGo --local-dir ./models/MedGo # Or using git-lfs git lfs install git clone https://huggingface.co/OpenMedZoo/MedGo ``` ### Basic Inference ```python from transformers import AutoModelForCausalLM, AutoTokenizer # Load model and tokenizer model_path = "OpenMedZoo/MedGo" tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True) model = AutoModelForCausalLM.from_pretrained( model_path, device_map="auto", trust_remote_code=True, torch_dtype="auto" ) # Medical Q&A example messages = [ {"role": "system", "content": "You are a professional medical assistant. Please answer questions based on medical knowledge."}, {"role": "user", "content": "What is hypertension and what are the common treatment methods?"} ] # Generate response inputs = tokenizer.apply_chat_template( messages, tokenize=True, add_generation_prompt=True, return_tensors="pt" ).to(model.device) outputs = model.generate( inputs, max_new_tokens=512, temperature=0.7, top_p=0.9, do_sample=True ) response = tokenizer.decode(outputs[0][len(inputs[0]):], skip_special_tokens=True) print(response) ``` ### Batch Inference ```bash # Use the provided inference script python scripts/inference.py \ --model_path OpenMedZoo/MedGo \ --input_file examples/medical_qa.jsonl \ --output_file results/predictions.jsonl \ --batch_size 4 ``` ### Accelerated Inference with vLLM ```python from vllm import LLM, SamplingParams # Initialize vLLM llm = LLM(model="OpenMedZoo/MedGo", trust_remote_code=True) sampling_params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=512) # Batch inference prompts = [ "What are the symptoms and treatment methods for diabetes?", "What dietary precautions should hypertensive patients take?" ] outputs = llm.generate(prompts, sampling_params) for output in outputs: print(output.outputs[0].text) ``` ## ๐Ÿ”ง Training Details MedGo employs a **two-stage fine-tuning strategy** to balance general medical knowledge with clinical task adaptation. ### Stage I: General Medical Alignment **Objective**: Establish a solid foundation of medical knowledge and improve Q&A standardization - **Data Sources**: - Authoritative medical literature (PubMed, medical textbooks) - Clinical guidelines and diagnostic standards - Medical encyclopedia entries and terminology databases - **Training Methods**: - Supervised Fine-Tuning (SFT) - Chain-of-Thought (CoT) guided samples - Medical terminology alignment and safety constraints ### Stage II: Clinical Task Enhancement **Objective**: Enhance complex case reasoning and multi-task processing capabilities - **Data Sources**: - Real medical records (fully anonymized) - Outpatient and emergency records with complex multi-diagnosis samples - Research articles and quality control cases - **Data Augmentation Techniques**: - Semantic paraphrasing and multi-perspective expansion - Complex case synthesis - Doctor-patient interaction simulation - **Training Methods**: - Multi-Task Learning (medical record summary, differential diagnosis, examination suggestions, etc.) - Preference Alignment (DPO/KTO) - Expert feedback iterative optimization ### Training Optimization Focus - โœ… Strengthen information extraction and cross-evidence reasoning for complex cases - โœ… Improve medical consistency and interpretability of outputs - โœ… Optimize expression compliance and safety - โœ… Continuous iteration through expert samples and automated evaluation ## ๐Ÿ’ก Use Cases ### โœ… Suitable Scenarios | Scenario | Description | |----------|-------------| | **Clinical Assistance** | Preliminary diagnosis suggestions, medical record writing, formatted report generation | | **Research Support** | Literature summarization, research idea generation, data analysis assistance | | **Quality Control** | Medical document compliance checking, clinical process quality control | | **System Integration** | Embedded in HIS/EMR systems to provide intelligent decision support | | **Medical Education** | Case discussions, medical knowledge Q&A, clinical reasoning training | ### ๐Ÿšซ Unsuitable Scenarios - โŒ **Cannot Replace Doctors**: Only an auxiliary tool, not a standalone diagnostic basis - โŒ **High-Risk Operations**: Not recommended for surgical decisions or other high-risk medical operations - โŒ **Rare Disease Limitations**: May perform poorly on rare diseases outside training data - โŒ **Emergency Care**: Not suitable for scenarios requiring immediate decisions ## โš ๏ธ Limitations & Risks ### Model Limitations 1. **Understanding Bias**: Despite covering extensive medical knowledge, may still produce understanding biases or incorrect recommendations 2. **Complex Cases**: Higher risk for cases with complex conditions, severe complications, or missing information 3. **Knowledge Currency**: Medical knowledge continuously updates; training data may lag 4. **Language Limitation**: Primarily designed for Chinese medical scenarios; performance in other languages may vary ### Usage Recommendations - โš ๏ธ Use in controlled environments with clinical expert review of generated results - โš ๏ธ Treat model outputs as auxiliary references, not final diagnostic conclusions - โš ๏ธ For sensitive cases or high-risk scenarios, expert consultation is mandatory - โš ๏ธ Deployment requires internal validation, security review, and clinical testing ### Data Privacy & Compliance - ๐Ÿ”’ Training data fully anonymized - ๐Ÿ”’ Attention to patient privacy protection during use - ๐Ÿ”’ Production deployment must comply with healthcare data security regulations (e.g., HIPAA, GDPR) - ๐Ÿ”’ Local deployment recommended to avoid sensitive data transmission ## ๐Ÿ“š Citation If MedGo is helpful for your research or project, please cite our work: ```bibtex @misc{openmedzoo_2025, author = { OpenMedZoo }, title = { MedGo (Revision 640a2e2) }, year = 2025, url = { https://huggingface.co/OpenMedZoo/MedGo }, doi = { 10.57967/hf/7024 }, publisher = { Hugging Face } } ``` ## ๐Ÿ“„ License This project is licensed under the [Apache License 2.0](LICENSE). **Commercial Use Notice**: - โœ… Commercial use and modification allowed - โœ… Original license and copyright notice must be retained - โœ… Contact us for technical support when integrating into healthcare systems ## ๐Ÿค Contributing We welcome community contributions! Here's how to participate: ### Contribution Types - ๐Ÿ› Submit bug reports - ๐Ÿ’ก Propose new features - ๐Ÿ“ Improve documentation - ๐Ÿ”ง Submit code fixes or optimizations - ๐Ÿ“Š Share evaluation results and use cases ## ๐Ÿ™ Acknowledgments Thanks to all contributors to the MedGo project: - Model development and fine-tuning algorithm team - Data annotation and quality control team - Clinical expert guidance and review team - Open-source community support and feedback Special thanks to: - [Qwen Team](https://github.com/QwenLM/Qwen) for providing excellent foundation models - All healthcare institutions that provided data and feedback ## ๐Ÿ“ง Contact - **HuggingFace**: [Model Homepage](https://huggingface.co/OpenMedZoo/MedGo) ## Copyright - Publisher: Tongji University Affiliated East Hospital โ€” Sole Corresponding Author - Co-developer / Technical Support: Shanghai Shuole Technology Co., Ltd. - Contact: dongfyy@pudong.gov.cn - Version: v1.0 - Attribution (required): โ€œPowered by Med-Go 32B, released by Tongji University Affiliated East Hospital (v1.0).โ€ ---