Anony-mous committed
Commit 65762fa · verified · 1 Parent(s): 085688f

Update README.md

Files changed (1): README.md (+28 −175)

README.md CHANGED
@@ -1,214 +1,67 @@
  # SafeMed-R1: A Trustworthy Medical Reasoning Model

- 🔍 **Overview**
- ![SafeMed-R1 Overview](https://github.com/OpenMedZoo/SafeMed-R1/safemed-r1-safety-overview.png)
-
- Modern medical LLMs can “get the answer right” on exams but still fail to earn clinical trust:
-
- - They often lack transparent reasoning chains that clinicians can audit.
- - Their behavior may drift from medical ethics and regulatory requirements.
- - They are vulnerable to induction attacks (jailbreaks, harmful advice, unethical suggestions).
-
- ---
-
- ✨ **Highlights**
-
- **Trustworthy and safe output**
- This model adheres to medical ethics and safety principles. It has been meticulously fine-tuned to avoid prohibited content and harmful advice, delivering useful, fact-based answers alongside appropriate disclaimers. Fine-tuning on safety data (such as MedSafetyBench) has significantly enhanced the model's safety while maintaining its performance.
-
- **Attack resistance**
- SafeMed-R1 has been trained to resist common jailbreak attacks and adversarial requests that could otherwise bypass security filters. Through multi-dimensional reward optimisation, it has learned to reject or safely handle malicious queries. (Recent research indicates that standard LLMs remain vulnerable to simple multi-step jailbreak attacks.)
-
- **Explainable reasoning**
- Benefiting from chain-of-thought training, this model can progressively explain its medical reasoning process when prompted. This transparency helps users and clinicians understand the logic behind answers, thereby enhancing trust in the recommended outcomes. SafeMed-R1 leverages this capability to internalise expert-reviewed reasoning pathways for medical questions.
-
- ---
-
- ## ⚡ Introduction
-
- The model underwent a rigorous two-stage training process:
-
- 1. **Supervised fine-tuning (SFT)** using expert-validated chain-of-thought (CoT) medical reasoning data alongside an ethics and safety guidance dataset (covering diverse medical ethical and legal risks).
- 2. **Reinforcement learning (RL)** to align the model's outputs with medical ethical guidelines and enhance its robustness against adversarial inputs.
-
- The resulting medical LLM demonstrates strong performance in clinical tasks while minimising the risk of unsafe or unreliable outputs.

  ---

- ## ✨ Key Features
-
- ### 1. Strong Clinical Reasoning
-
- - Achieves SOTA or competitive performance on multiple Chinese medical QA benchmarks (e.g., CMExam, MedQA, Chinese-Exam).
- - Produces structured, step-by-step clinical reasoning aligned with real diagnostic thinking.
-
- ### 2. Ethics & Safety Alignment
-
- Trained on rich medical ethics and patient safety corpora, covering:
-
- - Clinical treatment ethics
- - Research ethics
- - End-of-life care
- - Public health ethics
- - Doctor–patient relationship ethics
- - Infection control, medication safety, incident reporting, legal compliance, etc.
-
- Evaluated on the MedSafety and MedEthics benchmarks, SafeMed-R1 achieves leading performance across sub-dimensions.
-
- ### 3. 🔐 Attack-Resistant Medical Alignment
-
- Reinforcement learning is performed using red-team attack scenarios tailored to healthcare, including:
-
- - Dangerous treatment requests
- - Requests to bypass the standard of care
- - Misuse of medical devices, drugs, or emergency procedures
-
- SafeMed-R1 shows top performance on **Med-Safety-Bench**, evaluated against strong models including GPT-4o, DeepSeek-V3, and Qwen3-235B-A22B. It is capable of correctly “saying no” and explaining why, with guideline- and regulation-based reasoning.
-
- ---
-
- ## How to Use (Transformers)

  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer
  import torch

  model_id = "OpenMedZoo/SafeMed-R1"
- tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
  model = AutoModelForCausalLM.from_pretrained(
-     model_id,
-     torch_dtype=torch.bfloat16,
-     device_map="auto",
-     trust_remote_code=True,
- )
-
- messages = [
-     {"role": "system", "content": "You are SafeMed-R1, a cautious medical assistant. Provide safe, factual info and disclaimers. Do not provide diagnosis or treatment instructions."},
-     {"role": "user", "content": "I keep getting headaches lately. What should I do?"},
- ]
- prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
-
- outputs = model.generate(
-     **tok(prompt, return_tensors="pt").to(model.device),
-     max_new_tokens=512,
-     temperature=0.6,
-     top_p=0.9,
-     do_sample=True,
  )
- print(tok.decode(outputs[0], skip_special_tokens=True))
- ```

- Tips:
-
- - Include a safety-focused system prompt (ethics, disclaimers, refusal policy).
- - For reasoning, prefer short, verifiable rationale summaries in production.
-
- ## 🚀 Install & Deploy (vLLM)
-
- ### Step 1 — Install vLLM
-
- Follow the official vLLM installation guide to select the version compatible with your environment (GPU/CPU/TPU, CUDA/ROCm, PyTorch):
-
- - Docs: <https://docs.vllm.ai/en/latest/>
-
- ### Step 2 — Download Weights (Hugging Face)
-
- Pull the model weights from Hugging Face. You can reference the repo directly in vLLM, or pre-download them locally:
-
- - Hugging Face: <https://huggingface.co/OpenMedZoo/SafeMed-R1/tree/main>
-
- **Option A (recommended): Use the Hugging Face repo name directly in vLLM**
-
- ```bash
- --model OpenMedZoo/SafeMed-R1
- ```
-
- **Option B: Pre-download locally (example)**
-
- ```bash
- pip install -U "huggingface_hub[cli]"
-
- huggingface-cli download OpenMedZoo/SafeMed-R1 \
-     --local-dir ./models/SafeMed-R1
- ```
-
- Then set:
-
- ```bash
- --model ./models/SafeMed-R1
  ```

- ### Step 3 — Deploy (OpenAI-Compatible API Server)
-
- Minimal vLLM `serve` example:

  ```bash
- MODEL_PATH="${1:-./models/SafeMed-R1}"
- TENSOR_PARALLEL_SIZE="${2:-4}"
- PIPELINE_PARALLEL_SIZE="${3:-1}"
  PORT=50050
-
  vllm serve "$MODEL_PATH" \
      --host 0.0.0.0 \
-     --port "$PORT" \
      --trust-remote-code \
      --served-model-name "safemed-r1" \
-     --tensor-parallel-size "$TENSOR_PARALLEL_SIZE" \
-     --pipeline-parallel-size "$PIPELINE_PARALLEL_SIZE" \
      --gpu-memory-utilization 0.9 \
      --disable-sliding-window \
      --max-model-len 4096 \
      --enable-prefix-caching
  ```

- For additional parameters (multi-GPU, performance tuning, quantization, server arguments), see:
-
- - <https://docs.vllm.ai/en/latest/>
-
- ---
-
- ## 📊 Evaluation
-
- ![Detailed Safety Results 1](https://github.com/OpenMedZoo/SafeMed-R1/safemed-r1-safety-results1.png)
-
- SafeMed-R1 surpasses its base model and size-matched LLMs on both medical knowledge QA and safety/ethics, combining strong accuracy with robust safety alignment (see the associated radar plot in our paper/report).
-
- ![Safety Results](https://github.com/OpenMedZoo/SafeMed-R1/safemed-r1-safety-results.png)
-
- - **Knowledge QA and exams**: SafeMed-R1 leads or is competitive on MedQA-CN/TW, CMExam (val/test), PediaBench (MC), and CE-Phys/Pharm/Nurse, producing structured, step-by-step clinical reasoning (see Table 1 in the paper).
-
- ![Detailed Safety Results 3](https://github.com/OpenMedZoo/SafeMed-R1/safemed-r1-safety-results3.png)
-
- - **Safety and ethics**: SafeMed-R1 attains top or near-top results on MedSafety and MedEthics. On Med-Safety-Bench, it achieves consistently lower risk scores across evaluators (GPT-4o, DeepSeek-V3, Qwen3-235B-A22B), indicating stronger resistance to jailbreak/induction attacks and a strong Overall-Average score (see Table 2).
-
- ---
-
- ## 🧪 Case Studies
-
- ![Case Example](https://github.com/OpenMedZoo/SafeMed-R1/safemed-r1-safety-case.png)
-
- In diverse safety-critical scenarios (e.g., biased/unsafe treatment requests, unnecessary invasive procedures), SafeMed-R1:
-
- - Reliably flags risk
- - Declines harmful actions
- - Offers guideline-aligned, conservative alternatives with clear patient counseling
-
- This demonstrates practical safety, explainability, and adherence to clinical standards.
-
- ---
-
- ## 🔒 Safety, Privacy & Compliance
-
- - Data anonymisation, authorisation protocols, and restrictions on secondary distribution
- - Adherence to local laws, regulations, and medical ethics guidelines
-
- **Disclaimer (English):**
- This project is intended solely for research and educational purposes and must not be used as a substitute for professional medical advice or diagnosis.
-
- **Disclaimer (Chinese, translated):**
- Privacy and compliance: data anonymisation, authorisation, and restrictions on secondary distribution; adherence to local laws, regulations, and medical ethics guidelines.
- This project is for research and education only and must not replace professional medical advice, diagnosis, or treatment.
-
- ---
-
- ## 🙏 Acknowledgements
-
- We thank the open-source community for toolchains and benchmarks, especially:
-
- - **LLaMA-Factory**: <https://github.com/hiyouga/LLaMA-Factory>
- - **MS-Swift**: <https://github.com/microsoft/MS-Swift>
-
- These projects greatly accelerated the construction and implementation of SafeMed-R1.
 
  # SafeMed-R1: A Trustworthy Medical Reasoning Model

+ <div align="center">
+ <a href="https://github.com/OpenMedZoo/SafeMed-R1" target="_blank">GitHub</a> |
+ <a href="#" target="_blank">Paper (coming soon)</a>
+ </div>

+ ## 1 Introduction

+ SafeMed-R1 is a medical LLM designed for trustworthy medical reasoning. It thinks before answering, resists jailbreaks, and returns safe, auditable outputs aligned with medical ethics and regulations.

+ - Trustworthy and compliant: avoids harmful advice and provides calibrated, fact-based responses with appropriate disclaimers.
+ - Attack resistance: trained with healthcare-specific red teaming and multi-dimensional reward optimization to safely refuse risky requests.
+ - Explainable reasoning: can provide structured, step-by-step clinical reasoning when prompted.

+ For more information, visit our GitHub repository:
+ <https://github.com/OpenMedZoo/SafeMed-R1>

  ---

+ ## Usage

+ You can use SafeMed-R1 in the same way as an instruction-tuned Qwen-style model. It can be deployed with vLLM or run directly via Transformers.

+ Transformers (direct inference):

  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer
  import torch

  model_id = "OpenMedZoo/SafeMed-R1"
  model = AutoModelForCausalLM.from_pretrained(
+     model_id, torch_dtype="auto", device_map="auto", trust_remote_code=True
  )
+ tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

+ messages = [{"role": "user", "content": "How to relieve a mild cough safely?"}]
+ inputs = tokenizer(
+     tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True),
+     return_tensors="pt",
+ ).to(model.device)

+ outputs = model.generate(**inputs, max_new_tokens=1024)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
  ```
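The example above sends a bare user message; the removed revision of this README recommended adding a safety-focused system prompt. A minimal sketch of how one could be prepended before applying the chat template (the prompt wording below is illustrative, not an official SafeMed-R1 policy):

```python
# Illustrative safety-focused system prompt; the exact wording here is an
# assumption, not the official SafeMed-R1 refusal policy.
system_prompt = (
    "You are SafeMed-R1, a cautious medical assistant. "
    "Provide safe, factual information with appropriate disclaimers, "
    "and refuse harmful or unethical medical requests."
)

# Prepend the system turn to the conversation; the resulting list is what
# gets passed to tokenizer.apply_chat_template(...) as above.
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "How to relieve a mild cough safely?"},
]
```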

+ vLLM (OpenAI-compatible serving):

  ```bash
+ MODEL_PATH="OpenMedZoo/SafeMed-R1"
  PORT=50050
  vllm serve "$MODEL_PATH" \
      --host 0.0.0.0 \
+     --port "$PORT" \
      --trust-remote-code \
      --served-model-name "safemed-r1" \
+     --tensor-parallel-size 4 \
+     --pipeline-parallel-size 1 \
      --gpu-memory-utilization 0.9 \
      --disable-sliding-window \
      --max-model-len 4096 \
      --enable-prefix-caching
  ```
67