# SafeMed-R1: A Trustworthy Medical Reasoning Model

🔍 **Overview**

Modern medical LLMs can “get the answer right” on exams but still fail to earn clinical trust:

- They often lack transparent reasoning chains that clinicians can audit.
- Their behavior may drift from medical ethics and regulatory requirements.
- They are vulnerable to induction attacks (jailbreaks, harmful advice, and unethical suggestions).

---

✨ **Highlights**

**Trustworthy and safe output**
The model adheres to medical ethics and safety principles. It has been fine-tuned to avoid prohibited content and harmful advice, delivering useful, fact-based answers alongside appropriate disclaimers. Fine-tuning on safety data (such as MedSafetyBench) significantly enhanced the model's safety while preserving its task performance.

**Attack resistance**
SafeMed-R1 has been trained to resist common jailbreak attacks and adversarial requests that could otherwise bypass safety filters. Through multi-dimensional reward optimisation, it has learned to reject or safely handle malicious queries. (Recent research indicates that standard LLMs remain vulnerable to even simple multi-step jailbreak attacks.)

**Explainable reasoning**
Thanks to chain-of-thought training, the model can progressively explain its medical reasoning when prompted. This transparency helps users and clinicians understand the logic behind answers, enhancing trust in the recommended outcomes. SafeMed-R1 uses this capability to internalise expert-reviewed medical reasoning pathways.

---

## ⚡ Introduction

The model underwent a rigorous two-stage training process:

1. **Supervised fine-tuning (SFT)** on expert-validated chain-of-thought (CoT) medical reasoning data alongside an ethics and safety guidance dataset (covering diverse medical ethical and legal risks).
2. **Reinforcement learning (RL)** to align the model's outputs with medical ethical guidelines and enhance its robustness against adversarial inputs.

The resulting medical LLM demonstrates strong performance on clinical tasks while minimising the risk of unsafe or unreliable outputs.

---

## ✨ Key Features

### 1. Strong Clinical Reasoning

- Achieves SOTA or competitive performance on multiple Chinese medical QA benchmarks (e.g., CMExam, MedQA, Chinese-Exam).
- Produces structured, step-by-step clinical reasoning aligned with real diagnostic thinking.

### 2. Ethics & Safety Alignment

Trained on rich medical ethics and patient safety corpora, covering:

- Clinical treatment ethics
- Research ethics
- End-of-life care
- Public health ethics
- Doctor–patient relationship ethics
- Infection control, medication safety, incident reporting, legal compliance, etc.

Evaluated on the MedSafety and MedEthics benchmarks, SafeMed-R1 achieves leading performance across sub-dimensions.

### 3. 🔐 Attack-Resistant Medical Alignment

Reinforcement learning is performed on red-team attack scenarios tailored to healthcare, including:

- Dangerous treatment requests
- Requests to bypass the standard of care
- Misuse of medical devices, drugs, or emergency procedures

SafeMed-R1 shows top performance on **MedSafetyBench**, evaluated against strong models including GPT-4o, DeepSeek-V3, and Qwen3-235B-A22B. It is capable of correctly “saying no” and explaining why, with guideline- and regulation-based reasoning.

---

## How to Use (Transformers)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "OpenMedZoo/SafeMed-R1"
tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

messages = [
    {"role": "system", "content": "You are SafeMed-R1, a cautious medical assistant. Provide safe, factual info and disclaimers. Do not provide diagnosis or treatment instructions."},
    # "I've been getting frequent headaches lately, what should I do?"
    {"role": "user", "content": "我最近总是头痛,应该怎么办?"},
]
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tok(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.6,
    top_p=0.9,
    do_sample=True,
)
# Decode only the newly generated tokens, not the echoed prompt.
print(tok.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

Tips:

- Include a safety-focused system prompt (ethics, disclaimers, refusal policy).
- For production reasoning, prefer short, verifiable rationale summaries over long free-form chains.
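
To make the first tip concrete, here is a hypothetical example of a safety-focused system prompt; the wording is illustrative only, not an official prompt shipped with the model:

```python
# A hypothetical safety-focused system prompt covering the three elements
# named in the tip above: ethics, disclaimers, and a refusal policy.
SAFETY_SYSTEM_PROMPT = (
    "You are SafeMed-R1, a cautious medical assistant.\n"
    "- Ethics: follow medical ethics; never recommend harmful or unproven treatments.\n"
    "- Disclaimers: remind users you are not a substitute for a licensed clinician.\n"
    "- Refusal policy: decline dangerous or out-of-scope requests and explain why."
)

# Used exactly like the system message in the example above.
messages = [
    {"role": "system", "content": SAFETY_SYSTEM_PROMPT},
    {"role": "user", "content": "What should I do about frequent headaches?"},
]
```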

## 🚀 Install & Deploy (vLLM)

### Step 1 — Install vLLM

Follow the official vLLM installation guide to select the version compatible with your environment (GPU/CPU/TPU, CUDA/ROCm, PyTorch):

- Docs: <https://docs.vllm.ai/en/latest/>

### Step 2 — Download Weights (Hugging Face)

Pull the model weights from Hugging Face. You can reference the repo directly in vLLM, or pre-download it locally:

- Hugging Face: <https://huggingface.co/OpenMedZoo/SafeMed-R1/tree/main>

**Option A (recommended): Use the Hugging Face repo name directly in vLLM**

```bash
--model OpenMedZoo/SafeMed-R1
```

**Option B: Pre-download locally (example)**

```bash
pip install -U "huggingface_hub[cli]"

huggingface-cli download OpenMedZoo/SafeMed-R1 \
  --local-dir ./models/SafeMed-R1
```

Then point vLLM at the local path:

```bash
--model ./models/SafeMed-R1
```

### Step 3 — Deploy (OpenAI-Compatible API Server)

Minimal vLLM `serve` example:

```bash
MODEL_PATH="${1:-./models/SafeMed-R1}"
TENSOR_PARALLEL_SIZE="${2:-4}"
PIPELINE_PARALLEL_SIZE="${3:-1}"
PORT=50050

vllm serve "$MODEL_PATH" \
  --host 0.0.0.0 \
  --port "$PORT" \
  --trust-remote-code \
  --served-model-name "safemed-r1" \
  --tensor-parallel-size "$TENSOR_PARALLEL_SIZE" \
  --pipeline-parallel-size "$PIPELINE_PARALLEL_SIZE" \
  --gpu-memory-utilization 0.9 \
  --disable-sliding-window \
  --max-model-len 4096 \
  --enable-prefix-caching
```

For additional parameters (multi-GPU, performance tuning, quantization, server arguments), see:

- <https://docs.vllm.ai/en/latest/>
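
Once the server is up, it can be queried like any OpenAI-compatible endpoint. A minimal sketch using only the Python standard library, assuming the port (50050) and served model name (safemed-r1) from the script above:

```python
import json
import urllib.request

# Build an OpenAI-style chat completion request for the vLLM server.
payload = {
    "model": "safemed-r1",  # must match --served-model-name
    "messages": [
        {"role": "system", "content": "You are SafeMed-R1, a cautious medical assistant."},
        {"role": "user", "content": "What are common causes of tension headaches?"},
    ],
    "temperature": 0.6,
    "max_tokens": 512,
}

req = urllib.request.Request(
    "http://localhost:50050/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Requires the server from Step 3 to be running:
# with urllib.request.urlopen(req) as resp:
#     body = json.load(resp)
#     print(body["choices"][0]["message"]["content"])
```

The official `openai` Python client works the same way: point `base_url` at `http://localhost:50050/v1` and pass any string as the API key.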

---

## 📊 Evaluation

SafeMed-R1 surpasses its base model and size-matched LLMs on both medical knowledge QA and safety/ethics, combining strong accuracy with robust safety alignment (see the radar plot in our paper/report).

- **Knowledge QA and exams**: SafeMed-R1 leads or is competitive on MedQA-CN/TW, CMExam (val/test), PediaBench (MC), and CE-Phys/Pharm/Nurse, producing structured, step-by-step clinical reasoning (see Table 1 in the paper).
- **Safety and ethics**: SafeMed-R1 attains top or near-top results on MedSafety and MedEthics. On MedSafetyBench, it achieves consistently lower risk scores across evaluators (GPT-4o, DeepSeek-V3, Qwen3-235B-A22B), indicating stronger resistance to jailbreak/induction attacks, and yields a strong Overall-Average score (see Table 2).

---

## 🧪 Case Studies

In diverse safety-critical scenarios (e.g., biased/unsafe treatment requests, unnecessary invasive procedures), SafeMed-R1:

- Reliably flags risk
- Declines harmful actions
- Offers guideline-aligned, conservative alternatives with clear patient counseling

This demonstrates practical safety, explainability, and adherence to clinical standards.

---

## 🔒 Safety, Privacy & Compliance

- Data anonymisation, authorisation protocols, and restrictions on secondary distribution
- Adherence to local laws, regulations, and medical ethics guidelines

**Disclaimer:**
This project is intended solely for research and educational purposes and must not be used as a substitute for professional medical advice, diagnosis, or treatment.

---

## 🙏 Acknowledgements

We thank the open-source community for its toolchains and benchmarks, especially:

- **LLaMA-Factory**: <https://github.com/hiyouga/LLaMA-Factory>
- **MS-Swift**: <https://github.com/modelscope/ms-swift>

These projects greatly accelerated the development of SafeMed-R1.