README.md · 360kaUser/Sanad-1.0 at main

File size: 14,613 Bytes

---
license: apache-2.0
language:
  - en
  - ar
tags:
  - medical
  - clinical-ai
  - medgemma
  - fine-tuned
  - diagnosis
  - differential-diagnosis
  - clinical-transcription
  - arabic-medical
  - qlora
  - healthcare
  - gemma3_text
base_model: google/medgemma-27b-text-it
datasets:
  - akemiH/NoteChat
  - starmpcc/Asclepius-Synthetic-Clinical-Notes
  - AGBonnet/augmented-clinical-notes
  - omi-health/medical-dialogue-to-soap-summary
  - openlifescienceai/medmcqa
  - GBaker/MedQA-USMLE-4-options
  - zhengyun21/PMC-Patients
  - lingshu-medical-mllm/ReasonMed
  - UCSC-VLAA/MedReason
  - FreedomIntelligence/medical-o1-reasoning-SFT
  - qiaojin/PubMedQA
  - appier-ai-research/StreamBench
  - MustafaIbrahim/medical-arabic-qa
  - MKamil/arabic_medical_50k
pipeline_tag: text-generation
library_name: transformers
model-index:
  - name: Sanad-1.0
    results:
      - task:
          type: text-generation
          name: Medical Question Answering
        dataset:
          type: GBaker/MedQA-USMLE-4-options
          name: MedQA USMLE
        metrics:
          - type: accuracy
            value: 87.7
            name: MedQA Accuracy
---

# 🏥 Sanad-1.0 — Clinical AI Assistant

<p align="center">
  <img src="https://img.shields.io/badge/Base_Model-MedGemma_27B-blue" alt="Base Model">
  <img src="https://img.shields.io/badge/Parameters-27B-green" alt="Parameters">
  <img src="https://img.shields.io/badge/Precision-BF16-yellow" alt="Precision">
  <img src="https://img.shields.io/badge/Training_Data-551K_examples-orange" alt="Training Data">
  <img src="https://img.shields.io/badge/Languages-English_|_Arabic-purple" alt="Languages">
  <img src="https://img.shields.io/badge/License-Apache_2.0-red" alt="License">
</p>

**Sanad-1.0** (سند — meaning "support" or "pillar" in Arabic) is a fine-tuned clinical AI model built on Google's [MedGemma-27B-text-it](https://huggingface.co/google/medgemma-27b-text-it). It is purpose-built for **Mediscribe**, a comprehensive clinical AI platform providing medical diagnosis, differential diagnosis, clinical transcription, and bilingual Arabic-English medical support.

Sanad-1.0 has been trained on **551,491 curated medical examples** across 15 specialized healthcare datasets using a **4-stage progressive fine-tuning pipeline** with QLoRA.

> **Try it live:** [Sanad-1 Demo](https://huggingface.co/spaces/360kaUser/Sanad-1) | [Sanad Demo](https://huggingface.co/spaces/360kaUser/sanad)

---

## ✨ Key Capabilities

| Capability | Description |
|------------|-------------|
| 🏥 **Clinical Transcription** | Converts doctor-patient conversations into structured SOAP notes and clinical documentation |
| 🔬 **Medical Diagnosis** | Analyzes patient presentations with systematic clinical reasoning to arrive at diagnoses |
| 📋 **Differential Diagnosis** | Generates ranked differential diagnoses with probability assessments and reasoning chains |
| 🧠 **Chain-of-Thought Reasoning** | Provides transparent, step-by-step medical reasoning with `<thinking>` traces |
| 🌍 **Arabic Medical Support** | Full bilingual capability (English + Arabic) for clinical consultations |
| 📚 **USMLE-Level Knowledge** | Trained on USMLE Step 1/2/3 questions across 21+ medical specialties |

---

## 🚀 Quick Start

### Using Transformers

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "360kaUser/Sanad-1.0"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are Mediscribe, a clinical diagnostic AI assistant. Analyze the patient presentation and provide a diagnosis with clinical reasoning."},
    {"role": "user", "content": "A 55-year-old male presents with sudden onset crushing chest pain radiating to the left arm, diaphoresis, and shortness of breath. He has a history of hypertension, type 2 diabetes, and smokes 1 pack per day. ECG shows ST elevation in leads II, III, and aVF. Troponin I is elevated at 2.5 ng/mL."}
]

inputs = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(inputs, max_new_tokens=1024, temperature=0.7, top_p=0.9, do_sample=True)
response = tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True)
print(response)
```

### Using Unsloth (Faster Inference)

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    "360kaUser/Sanad-1.0",
    max_seq_length=2048,
    load_in_4bit=True,  # Use 4-bit for lower VRAM
)
FastLanguageModel.for_inference(model)

# Same message format as above
```

---

## 💬 Usage Examples

### 1. Clinical Transcription

```python
messages = [
    {"role": "system", "content": "You are Mediscribe, a clinical documentation AI assistant. Given a doctor-patient conversation, generate a comprehensive clinical note."},
    {"role": "user", "content": """Doctor: Good morning, what brings you in today?
Patient: I've been having this terrible headache for the past 3 days. It's mainly on the right side.
Doctor: On a scale of 1-10, how bad is the pain?
Patient: About 7. And I've been feeling nauseous too.
Doctor: Any visual changes? Sensitivity to light?
Patient: Yes, bright lights make it worse.
Doctor: Have you had migraines before?
Patient: My mother gets them, but I've never had one this bad."""}
]
```

**Output:** Generates a structured clinical note with Chief Complaint, HPI (onset, location, severity, associated symptoms), Review of Systems, Family History, Assessment, and Plan.

### 2. Differential Diagnosis

```python
messages = [
    {"role": "system", "content": "You are Mediscribe, a clinical diagnostic AI assistant. Provide a ranked differential diagnosis with reasoning."},
    {"role": "user", "content": "A 30-year-old female presents with fatigue, weight gain of 15 pounds over 3 months, cold intolerance, constipation, and dry skin. Hair thinning and difficulty concentrating. HR 58 bpm, BP 110/70, temp 97.2F."}
]
```

**Output:** Ranked differential including Hypothyroidism (most likely), Depression, Anemia, with clinical reasoning for each and recommended next steps (TSH, Free T4, CBC).

### 3. Arabic Medical Query

```python
messages = [
    {"role": "system", "content": "أنت Mediscribe، مساعد ذكاء اصطناعي طبي. قم بتحليل السؤال الطبي وتقديم إجابة دقيقة."},
    {"role": "user", "content": "ما هي أعراض مرض السكري من النوع الثاني؟"}
]
```

**Output:** Comprehensive Arabic response covering all Type 2 Diabetes symptoms (العطش الشديد، التبول المتكرر، الجوع المفرط, etc.) with medical terminology.

---

## 🏗️ Model Architecture

| Component | Detail |
|-----------|--------|
| **Base Model** | [google/medgemma-27b-text-it](https://huggingface.co/google/medgemma-27b-text-it) |
| **Architecture** | Gemma 3 (27B parameters) |
| **Context Window** | 128K tokens |
| **Tensor Type** | BF16 (Brain Float 16) |
| **Format** | Safetensors |
| **Fine-Tuning Method** | QLoRA (4-bit NF4 quantization during training) |
| **LoRA Configuration** | r=32, alpha=64, dropout=0.05 |
| **Target Modules** | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |

---

## 📊 Training Data — 551K Medical Examples

### Stage 1: Clinical Transcription (152,930 train + 8,048 val)

| Dataset | Records | Purpose |
|---------|---------|---------|
| [NoteChat](https://huggingface.co/datasets/akemiH/NoteChat) | 60,000 | Doctor-patient conversation → clinical note |
| [Asclepius Clinical Notes](https://huggingface.co/datasets/starmpcc/Asclepius-Synthetic-Clinical-Notes) | 30,000 | Clinical note comprehension & QA |
| [Augmented Clinical Notes](https://huggingface.co/datasets/AGBonnet/augmented-clinical-notes) | 60,000 | Conversation → note + note → JSON summary |
| [OMI Health SOAP](https://huggingface.co/datasets/omi-health/medical-dialogue-to-soap-summary) | 9,250 | Medical dialogue → SOAP note |
| [MTS-Dialog](https://github.com/abachaa/MTS-Dialog) | 1,601 | Dialogue → clinical note sections |
| [ACI-Bench](https://github.com/wyim/aci-bench) | 127 | Ambient clinical intelligence |

### Stage 2: Medical Diagnosis (61,920 train + 3,258 val)

| Dataset | Records | Purpose |
|---------|---------|---------|
| [MedMCQA](https://huggingface.co/datasets/openlifescienceai/medmcqa) | 40,000 | MCQ across 21 specialties with explanations |
| [PMC-Patients](https://huggingface.co/datasets/zhengyun21/PMC-Patients) | 15,000 | Patient case narratives with diagnosis |
| [MedQA USMLE](https://huggingface.co/datasets/GBaker/MedQA-USMLE-4-options) | 10,178 | USMLE Step 1/2/3 clinical vignettes |

### Stage 3: Differential Diagnosis & Reasoning (129,696 train + 6,826 val)

| Dataset | Records | Purpose |
|---------|---------|---------|
| [ReasonMed](https://huggingface.co/datasets/lingshu-medical-mllm/ReasonMed) | 80,000 | Multi-step medical reasoning chains |
| [MedReason](https://huggingface.co/datasets/UCSC-VLAA/MedReason) | 32,682 | Knowledge-grounded clinical reasoning |
| [Medical O1 Reasoning](https://huggingface.co/datasets/FreedomIntelligence/medical-o1-reasoning-SFT) | 19,704 | Chain-of-thought `<thinking>` traces |
| [DDXPlus](https://huggingface.co/datasets/appier-ai-research/StreamBench) | 3,136 | Symptom → ranked differential diagnosis |
| [PubMedQA](https://huggingface.co/datasets/qiaojin/PubMedQA) | 1,000 | Evidence-based biomedical QA |

### Stage 4: Arabic & General Medical (179,373 train + 9,440 val)

| Dataset | Records | Purpose |
|---------|---------|---------|
| [Arabic Medical QA](https://huggingface.co/datasets/MustafaIbrahim/medical-arabic-qa) | 52,657 | Arabic medical QA (30+ specialties) |
| [Arabic Medical 50K](https://huggingface.co/datasets/MKamil/arabic_medical_50k) | 50,000 | Arabic medical dialogues |
| Existing Clinical Training Data | 86,156 | ChatDoctor, Indian Medical QA, combined clinical data |

---

## 🎯 Training Configuration

### 4-Stage Progressive Fine-Tuning

Training used **decreasing learning rates** across stages to progressively build capabilities while preventing catastrophic forgetting:

```
Stage 1: Transcription    → LR: 2.0e-4  │ 152,930 examples │ Clinical documentation
Stage 2: Diagnosis        → LR: 1.5e-4  │  61,920 examples │ Diagnostic reasoning
Stage 3: DDx & Reasoning  → LR: 1.0e-4  │ 129,696 examples │ Advanced reasoning
Stage 4: Arabic & General → LR: 5.0e-5  │ 179,373 examples │ Bilingual + reinforcement
```

### Hyperparameters

| Parameter | Value |
|-----------|-------|
| Batch size | 2 per device |
| Gradient accumulation | 8 steps |
| Effective batch size | 16 |
| Max sequence length | 2,048 tokens |
| LR scheduler | Cosine |
| Warmup ratio | 0.03 |
| Weight decay | 0.01 |
| Max gradient norm | 0.3 |
| Optimizer | Paged AdamW 8-bit |
| Precision | BF16 |
| Packing | Enabled |
| Gradient checkpointing | Unsloth optimized |
| Hardware | NVIDIA A100 40GB |

---

## 📈 Performance

### Evaluation Results

| Category | Quality | Description |
|----------|---------|-------------|
| Clinical Transcription | ✅ High | Structured SOAP notes with CC, HPI, ROS, Assessment, Plan |
| Medical Diagnosis | ✅ High | Systematic analysis with risk factors, ECG interpretation, clinical reasoning |
| Differential Diagnosis | ✅ High | Ranked DDx with probability reasoning and recommended next steps |
| Chain-of-Thought | ✅ High | Transparent `<thinking>` reasoning traces |
| Arabic Medical | ✅ High | Comprehensive Arabic responses with medical terminology |

### Base Model Benchmarks (MedGemma-27B)

| Benchmark | Score |
|-----------|-------|
| MedQA (USMLE) | 87.7% |
| EHRQA | 90.0% |
| Path-VQA | 72.2% |
| AfriMed-QA | 78.8% |

---

## ⚕️ Medical Specialties

Trained coverage across **21+ medical specialties**:

<table>
<tr><td>Cardiology</td><td>Neurology</td><td>Pulmonology</td><td>Gastroenterology</td></tr>
<tr><td>Endocrinology</td><td>Nephrology</td><td>Hematology</td><td>Oncology</td></tr>
<tr><td>Ophthalmology</td><td>Dermatology</td><td>Orthopedics</td><td>Pediatrics</td></tr>
<tr><td>OB/GYN</td><td>Psychiatry</td><td>Surgery</td><td>Emergency Medicine</td></tr>
<tr><td>Infectious Disease</td><td>Rheumatology</td><td>Radiology</td><td>Pathology</td></tr>
<tr><td>Pharmacology</td><td>Anatomy</td><td>Biochemistry</td><td>Forensic Medicine</td></tr>
</table>

---

## ⚠️ Limitations & Ethical Considerations

### Important Disclaimers

> ⚠️ **Sanad-1.0 is an AI assistant designed to support healthcare professionals. It is NOT a replacement for clinical judgment.**

- **Not for self-diagnosis.** Patients should always consult qualified healthcare providers.
- **Training data limitations.** May not represent all populations, conditions, or clinical settings equally.
- **Arabic coverage depth.** Arabic medical capabilities may not fully match English-language depth in all specialties.
- **No real-time data.** Does not access real-time medical literature, drug interaction databases, or patient records.
- **Potential for errors.** Like all AI models, Sanad-1.0 may produce incorrect or incomplete information.

### Intended Use

✅ Clinical decision support for licensed healthcare professionals
✅ Medical education and training
✅ Clinical documentation assistance
✅ Research and academic applications

### Out-of-Scope Uses

❌ Direct patient-facing medical advice without physician oversight
❌ Emergency medical decision-making as sole source
❌ Legal or forensic medical opinions
❌ Prescribing medications without physician review

---

## 📜 Citation

```bibtex
@misc{sanad1-2025,
  title={Sanad-1.0: A Fine-Tuned Clinical AI Model for Medical Diagnosis, Transcription, and Arabic Medical Support},
  author={360kaUser},
  year={2025},
  publisher={HuggingFace},
  url={https://huggingface.co/360kaUser/Sanad-1.0},
  note={Fine-tuned from google/medgemma-27b-text-it on 551K medical examples using 4-stage QLoRA}
}
```

---

## 🙏 Acknowledgments

- [Google Health AI](https://health.google/) — MedGemma base model
- [Unsloth](https://unsloth.ai/) — Efficient fine-tuning framework
- All dataset creators and contributors listed in the Training Data section
- The open-source medical AI community

---

<p align="center">
  <b>Sanad-1.0</b> — سند<br>
  <i>Your AI pillar of clinical support</i>
</p>