---
base_model:
- TachyHealth/Gazal-R1-32B-sft-merged-preview
datasets:
- TachyHealth/medical_grpo
- TachyHealth/structured_medical
library_name: transformers
license: apache-2.0
license_link: https://huggingface.co/TachyHealth/Gazal-R1-32B-GRPO-preview/blob/main/LICENSE
pipeline_tag: text-generation
tags:
- gazal-r1
- grpo
- qwen3
- conversational
- medical
- clinical
- healthcare
- reasoning
---
# Gazal-R1-32B: Medical Reasoning Language Model
The model was presented in the paper [Gazal-R1: Achieving State-of-the-Art Medical Reasoning with Parameter-Efficient Two-Stage Training](https://huggingface.co/papers/2506.21594).
<a href="https://gazal.ai/" target="_blank" style="margin: 0px;">
<img alt="Gazal AI" src="./logo.png" style=" width: 70%;" />
</a>
## Model Highlights
Gazal-R1 is a state-of-the-art 32-billion-parameter language model specifically designed for medical reasoning and clinical decision-making. Built upon Qwen 3 32B, Gazal-R1 demonstrates that strategic training can enable mid-sized models to outperform significantly larger counterparts in specialized medical domains.
Key features include:
- **Medical Expertise**: Specialized training on 107,033 synthetic medical reasoning examples covering diagnostic reasoning, treatment planning, decision-making under uncertainty, and prognostic assessment
- **Transparent Reasoning**: Structured clinical thinking with step-by-step explanations in `<think></think>` tags, following established clinical reasoning frameworks
- **State-of-the-Art Performance**: Achieves 87.1% on MedQA, 81.6% on MMLU Pro (Medical), and 79.6% on PubMedQA, surpassing models up to 12× larger
- **Parameter Efficiency**: Advanced training techniques including Weight-Decomposed Low-Rank Adaptation (DoRA) and Rank-Stabilized LoRA (rsLoRA)
- **Alignment Optimization**: Refined through Group Relative Policy Optimization (GRPO) with sophisticated multi-component reward systems
- **Medical Knowledge**: Comprehensive understanding across multiple medical specialties and clinical scenarios
## Model Overview
**Gazal-R1-32B** has the following characteristics:
- **Type**: Causal Language Model (Medical Reasoning Specialist)
- **Base Model**: Qwen 3 32B
- **Training Stages**: Two-stage pipeline (Supervised Fine-Tuning + Reinforcement Learning)
- **Number of Parameters**: 32.8B
- **Number of Parameters (Non-Embedding)**: 31.2B
- **Context Length**: 32,768 tokens natively, extensible to 131,072 with YaRN
- **Training Data**: 107,033 synthetic medical reasoning examples + [MedReason dataset](https://huggingface.co/datasets/UCSC-VLAA/MedReason) (32,682 examples)
- **Fine-tuning Method**: DoRA + rsLoRA (Parameter-Efficient Fine-Tuning)
- **Alignment**: Group Relative Policy Optimization (GRPO)
For detailed methodology, training insights, and comprehensive evaluation, please refer to our [technical report](https://arxiv.org/abs/2506.21594).
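The jump from 32,768 to 131,072 tokens corresponds to a YaRN scaling factor of 4. The sketch below builds the matching `rope_scaling` entry, assuming Gazal-R1 follows the configuration convention of its Qwen 3 base. The key names and the helper `yarn_rope_scaling` are assumptions for illustration; check the model's `config.json` before relying on them.

```python
NATIVE_CONTEXT = 32_768      # tokens, from the model overview
EXTENDED_CONTEXT = 131_072   # tokens, from the model overview


def yarn_rope_scaling(native: int, target: int) -> dict:
    """Build a rope_scaling entry that stretches `native` positions to `target`.

    Hypothetical helper; the key names follow the Qwen 3 convention (assumed).
    """
    if target <= native:
        raise ValueError("target must exceed the native context length")
    return {
        "rope_type": "yarn",
        "factor": target / native,
        "original_max_position_embeddings": native,
    }


rope_scaling = yarn_rope_scaling(NATIVE_CONTEXT, EXTENDED_CONTEXT)
print(rope_scaling["factor"])  # 4.0
```

Long-context scaling of this kind is usually only worth enabling when prompts actually exceed the native window.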
## Performance Results
Scores (in %) on four standard medical benchmarks, compared with general-purpose and medical models of equal or larger size:
| Model | Size | MMLU Pro (Medical) | MedMCQA | MedQA | PubMedQA |
|-------|------|-------------------|---------|-------|----------|
| **Gazal-R1 (Final)** | **32B** | **81.6** | **71.9** | **87.1** | **79.6** |
| [Gazal-R1 (SFT-only)](https://huggingface.co/TachyHealth/Gazal-R1-32B-sft-merged-preview) | 32B | 79.3 | 72.3 | 86.9 | 77.6 |
| Llama 3.1 405B Instruct | 405B | 70.2 | 75.8 | 81.9 | 74.6 |
| Qwen 2.5 72B Instruct | 72B | 72.1 | 66.2 | 72.7 | 71.7 |
| Med42-Llama3.1-70B | 70B | 66.1 | 72.4 | 80.4 | 77.6 |
| Llama 3.1 70B Instruct | 70B | 74.5 | 72.5 | 78.4 | 78.5 |
| QwQ 32B | 32B | 70.1 | 65.6 | 72.3 | 73.7 |
| Qwen 3 32B | 32B | 78.4 | 71.6 | 84.4 | 76.7 |
**Key Achievements:**
- Highest scores in the table on MMLU Pro (Medical), MedQA, and PubMedQA
- Gains from GRPO training over the SFT-only checkpoint (+2.3 points on MMLU Pro, +2.0 points on PubMedQA)
- Outperforms models up to 12× larger (Llama 3.1 405B) on medical reasoning tasks
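The figures above can be recomputed from the results table. The snippet below is only a sanity check on the two Gazal-R1 rows; the variable names are illustrative.

```python
BENCHMARKS = ["MMLU Pro (Medical)", "MedMCQA", "MedQA", "PubMedQA"]
FINAL = [81.6, 71.9, 87.1, 79.6]      # Gazal-R1 (Final) row
SFT_ONLY = [79.3, 72.3, 86.9, 77.6]   # Gazal-R1 (SFT-only) row

# Difference in percentage points between the GRPO model and the SFT-only model
gains = {
    name: round(final - sft, 1)
    for name, final, sft in zip(BENCHMARKS, FINAL, SFT_ONLY)
}

# Parameter-count ratio between Llama 3.1 405B and Gazal-R1 (32B)
size_ratio = 405 / 32

for name, gain in gains.items():
    print(f"{name}: {gain:+.1f} points")
print(f"Llama 3.1 405B is {size_ratio:.1f}x the size of Gazal-R1")
```

This prints +2.3, -0.4, +0.2, and +2.0 points for the four benchmarks, and a size ratio of about 12.7.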
## Quickstart
### Basic Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "TachyHealth/Gazal-R1-32B-GRPO-preview"

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",  # use the dtype stored in the checkpoint
    device_map="auto",   # place layers on available devices (requires accelerate)
)

# Medical reasoning prompt
prompt = """A 65-year-old male presents with chest pain, shortness of breath, and elevated troponin levels.
ECG shows ST-segment elevation in leads II, III, and aVF. What is the most likely diagnosis and immediate management?"""
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate response with medical reasoning
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=2048,
    do_sample=True,  # ensure temperature/top_p/top_k take effect
    temperature=0.7,
    top_p=0.8,
    top_k=20,
)

# Keep only the newly generated tokens (drop the prompt)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
response = tokenizer.decode(output_ids, skip_special_tokens=True)
print("Medical Assessment:", response)
```
### Structured Medical Reasoning Format
Gazal-R1 is trained to provide structured medical reasoning in the following format:
```
<think>
Step 1: Analyze presenting symptoms - chest pain, dyspnea, troponin elevation
Step 2: Interpret ECG findings - ST elevation in inferior leads
Step 3: Consider differential diagnoses - STEMI vs NSTEMI vs unstable angina
Step 4: Identify culprit vessel - likely RCA given inferior lead changes
Step 5: Assess urgency - emergent intervention required
Step 6: Plan immediate management - dual antiplatelet, anticoagulation, cath lab
</think>
## Clinical Assessment
**Primary Diagnosis**: ST-Elevation Myocardial Infarction (STEMI), inferior wall
**Reasoning**: The combination of chest pain, elevated troponin, and ST-elevation in leads II, III, and aVF is pathognomonic for inferior STEMI, likely involving the right coronary artery (RCA).
**Immediate Management**:
1. **Reperfusion therapy**: Emergency cardiac catheterization with primary PCI
2. **Antiplatelet therapy**: Aspirin 325mg + P2Y12 inhibitor (clopidogrel/ticagrelor)
3. **Anticoagulation**: Heparin or bivalirudin
4. **Supportive care**: O2 if hypoxic, nitroglycerin for pain (avoid if hypotensive)
**Follow-up**: Post-PCI monitoring, echocardiogram, cardiac rehabilitation referral
```
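Downstream code often needs the reasoning trace and the final assessment separately. A minimal sketch, assuming the decoded response keeps the `</think>` tag as plain text; `split_reasoning` is a hypothetical helper, not part of the model's tooling.

```python
def split_reasoning(response: str) -> tuple[str, str]:
    """Return (reasoning, answer); reasoning is empty if no </think> tag is found."""
    head, sep, tail = response.partition("</think>")
    if not sep:
        # No closing tag: treat the whole response as the answer
        return "", response.strip()
    # Drop the opening tag if present (some chat templates emit it in the prompt)
    reasoning = head.replace("<think>", "", 1).strip()
    return reasoning, tail.strip()


sample = (
    "<think>\n"
    "Step 1: Analyze presenting symptoms\n"
    "</think>\n"
    "## Clinical Assessment\n"
    "**Primary Diagnosis**: ST-Elevation Myocardial Infarction (STEMI), inferior wall"
)
reasoning, answer = split_reasoning(sample)
print(reasoning)  # Step 1: Analyze presenting symptoms
print(answer.splitlines()[0])  # ## Clinical Assessment
```

Splitting on the closing tag rather than matching both tags keeps the helper usable when only `</think>` appears in the generated text.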
## Training Methodology
### Stage 1: Supervised Fine-Tuning (SFT)
- **Dataset**: 107,033 synthetic medical reasoning examples + [MedReason dataset](https://huggingface.co/datasets/UCSC-VLAA/MedReason)
- **Techniques**: DoRA + rsLoRA with rank 256
- **Focus**: Structured clinical reasoning across diagnostic, therapeutic, and prognostic scenarios
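To illustrate what DoRA and rsLoRA change relative to plain LoRA, here is a small pure-Python sketch based on the formulas in the respective papers, not on the authors' training code. rsLoRA scales the low-rank update by `alpha / sqrt(r)` instead of `alpha / r`, and DoRA renormalises each column of the updated weight and rescales it by a learned magnitude. The value of `ALPHA` and the toy matrices are assumptions; only the rank of 256 comes from this card.

```python
import math


def lora_scale(alpha: float, rank: int, rank_stabilized: bool) -> float:
    """Scaling applied to the low-rank update B @ A."""
    return alpha / math.sqrt(rank) if rank_stabilized else alpha / rank


def column_norms(w: list[list[float]]) -> list[float]:
    """L2 norm of each column of a matrix stored as a list of rows."""
    return [math.sqrt(sum(row[j] ** 2 for row in w)) for j in range(len(w[0]))]


def dora_merge(w0, delta, magnitude):
    """DoRA-style merge: unit-normalise each column of (w0 + delta), then scale by magnitude."""
    v = [[a + b for a, b in zip(r0, rd)] for r0, rd in zip(w0, delta)]
    norms = column_norms(v)
    return [[magnitude[j] * row[j] / norms[j] for j in range(len(row))] for row in v]


ALPHA = 256.0  # hypothetical; the card does not state alpha
RANK = 256     # rank reported for the SFT stage

plain = lora_scale(ALPHA, RANK, rank_stabilized=False)
stabilized = lora_scale(ALPHA, RANK, rank_stabilized=True)
print(plain, stabilized)  # 1.0 16.0

# Toy 2x2 weight: with a zero update and magnitude set to the column norms
# of w0, the merged weight reproduces w0.
w0 = [[3.0, 0.0], [4.0, 2.0]]
magnitude = column_norms(w0)
merged = dora_merge(w0, [[0.0, 0.0], [0.0, 0.0]], magnitude)
```

In Hugging Face PEFT, these two techniques should correspond to the `use_dora` and `use_rslora` options of `LoraConfig`.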
### Stage 2: Group Relative Policy Optimization (GRPO)
- **Algorithm**: Value-function-free reinforcement learning
- **Dataset**: UltraMedical subset (32K medical MCQs)
- **Rewards**: Multi-component system (accuracy, format, length control, repetition penalty)
- **Improvements**: Enhanced reasoning quality and format adherence
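GRPO avoids a value function by scoring several sampled completions per question and normalising each reward against its own group. The sketch below shows that idea with a simplified two-component reward (accuracy and format). The weights, the regular expression, and all function names are hypothetical, and the length-control and repetition components mentioned above are omitted.

```python
import re
from statistics import mean, pstdev

# Matches a non-empty <think>...</think> block followed by some answer text.
THINK_THEN_ANSWER = re.compile(r"^<think>.+?</think>\s*\S", re.DOTALL)


def format_reward(completion: str) -> float:
    """1.0 if the completion starts with a <think> block and then gives an answer."""
    return 1.0 if THINK_THEN_ANSWER.match(completion.strip()) else 0.0


def accuracy_reward(predicted: str, gold: str) -> float:
    """1.0 if the chosen option letter matches the reference, else 0.0."""
    return 1.0 if predicted.strip().upper() == gold.strip().upper() else 0.0


def total_reward(completion: str, predicted: str, gold: str,
                 w_acc: float = 1.0, w_fmt: float = 0.5) -> float:
    """Weighted sum of the two components; the weights are hypothetical."""
    return w_acc * accuracy_reward(predicted, gold) + w_fmt * format_reward(completion)


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Normalise rewards within one group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled completions for the same question (gold answer: "B").
group = [
    ("<think>Inferior leads point to the RCA.</think> Answer: B", "B"),
    ("<think>Lateral leads suggest the LCx.</think> Answer: C", "C"),
    ("Answer: B", "B"),  # correct but missing the reasoning block
    ("<think>Unsure.</think> Answer: D", "D"),
]
rewards = [total_reward(text, pred, "B") for text, pred in group]
advantages = group_relative_advantages(rewards)
print(rewards)  # [1.5, 0.5, 1.0, 0.5]
```

Completions that beat their group's average receive positive advantages and are reinforced; the rest are pushed down, without a separate critic model.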
## Model Capabilities
### Clinical Reasoning Types
1. **Diagnostic Reasoning**: Systematic symptom analysis → differential diagnosis
2. **Treatment Planning**: Evidence-based therapy selection with patient-specific factors
3. **Decision-Making Under Uncertainty**: Risk assessment and clinical judgment
4. **Prognostic Assessment**: Outcome prediction based on clinical evidence
### Medical Specialties Covered
- Internal Medicine
- Emergency Medicine
- Cardiology
- Pulmonology
- Infectious Disease
- Pharmacology
- Pathophysiology
- Clinical Laboratory Medicine
## Limitations and Important Disclaimers
### ⚠️ Critical Safety Information
- **NOT A MEDICAL DEVICE**: Gazal-R1 is a research model and is **NOT** intended for direct clinical use, diagnosis, or treatment planning
- **REQUIRES PROFESSIONAL VERIFICATION**: All outputs must be independently verified by qualified medical professionals
- **NO REAL-TIME UPDATES**: Knowledge is static and does not reflect the latest medical research or guidelines
### Technical Limitations
- **Knowledge Cutoff**: Training data reflects medical knowledge up to the training date
- **Hallucination Risk**: May generate plausible-sounding but factually incorrect information
- **Evaluation Scope**: Primarily evaluated on multiple-choice questions; real-world clinical scenarios may differ
- **Regional Bias**: Training data may contain geographical or demographic biases
### Ethical Considerations
- **Professional Responsibility**: Final medical decisions must always rest with qualified healthcare providers
- **Accountability**: Users assume responsibility for verifying and appropriately applying model outputs
- **Patient Safety**: Never use for emergency medical situations or time-critical decisions
## Use Cases
### Research and Education
- Medical education and training
- Clinical reasoning research
- Medical knowledge assessment
- Academic medical writing assistance
### Professional Support (With Supervision)
- Literature review assistance
- Clinical case analysis support
- Medical documentation aid
- Differential diagnosis exploration
### NOT Suitable For
- Direct patient care
- Emergency medical decisions
- Replacing clinical judgment
- Unsupervised medical advice
## Citation
If you find Gazal-R1 helpful in your research, please cite our work:
```bibtex
@article{gazal-r1-2025,
title={Gazal-R1: Achieving State-of-the-Art Medical Reasoning with Parameter-Efficient Two-Stage Training},
author={Ahmed M. Adly and Mostafa Samy and Amr Fawzy},
journal={arXiv preprint arXiv:2506.21594},
year={2025},
url={https://arxiv.org/abs/2506.21594}
}
```
## Model Access
- **Model Weights**: Available on Hugging Face Hub
- **Datasets**: Training datasets available at [TachyHealth/structured_medical](https://huggingface.co/datasets/TachyHealth/structured_medical) and [TachyHealth/medical_grpo](https://huggingface.co/datasets/TachyHealth/medical_grpo)
<!-- - **Technical Report**: [arXiv:2505.09388](https://arxiv.org/abs/2505.09388) -->
## License
This model is released under the Apache 2.0 License. Please review the license terms before use.
## Contact
For questions about Gazal-R1, please contact:
- **Research Team**: TachyHealth
- **Website**: [https://tachyhealth.com/](https://tachyhealth.com/)
- **Gazal Platform**: [Gazal.ai](https://gazal.ai)
---
*Developed by TachyHealth Research Team. This model represents a significant advancement in medical AI reasoning while emphasizing the critical importance of professional medical oversight.*